
Improvements in novoseek – March 2010


There have been several major improvements this month in novoseek:

  • Select the Publication Type from the Advanced Search panel
  • Users have been asking for it and it is now available when you are on the Advanced Search panel.

    TIP 1: Hold Ctrl (control) to select several Publication Types

    TIP 2: Learn more about the different Publication Types and their use when looking for scientific publications.
  • Complete authors list for each article
  • To give you more information about the authors of an article, we have updated the metadata of every publication with the complete list of authors.

    - In the search results page, you will see the first two authors and the last author of the publication. Check with a search example

    TIP: When you search for a specific author, that author appears highlighted in the results, and you will see four authors in total for each publication (the three mentioned above plus the highlighted author you searched for). Check with this example for Eley Robert

    - In the detail page of a publication, all of the authors are now listed. Check authors in a publication detail page

  • Disambiguation of authors
  • A common problem in the scientific literature is the wide variety of text formatting, which also affects author names. An author's name may be written as “Last name, First name”, as “Last name, first-name initial”, and so on. We now index all the known aliases of an author to make searches for an author's publications more comprehensive. Check direct example of author disambiguation

  • Better navigation from one search results page to another
  • Users suggested a more intuitive navigation menu at the bottom of the search results pages for switching from one page to another. This is done!
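
The author display rule described above (first two authors, last author, plus the searched author when there is one) can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post, not novoseek's actual code: the function name `authors_to_display`, the case-insensitive exact match, and every author name except Eley Robert are assumptions made for the example. The real site matches authors through their indexed aliases, which this sketch does not attempt.

```python
def authors_to_display(authors, searched=None):
    """Return (name, highlighted) pairs for one search result.

    Assumed rule, taken from the post: always show the first two authors
    and the last author; if a searched author appears in the list, show
    that author too and mark the name as highlighted.
    """
    if not authors:
        return []

    last = len(authors) - 1
    # Positions that are always shown (deduplicated for short lists).
    shown = {i for i in (0, 1, last) if 0 <= i <= last}

    # Hypothetical matching: case-insensitive exact comparison only.
    wanted = searched.casefold() if searched else None
    hits = {
        i for i, name in enumerate(authors)
        if wanted is not None and name.casefold() == wanted
    }

    shown |= hits
    # Keep the original author order of the publication.
    return [(authors[i], i in hits) for i in sorted(shown)]


if __name__ == "__main__":
    # Made-up author list; only "Eley Robert" comes from the post.
    example = ["Smith J", "Lee K", "Eley Robert", "Garcia M", "Chen L"]
    print(authors_to_display(example))
    print(authors_to_display(example, searched="Eley Robert"))
```

Without a searched author the sketch returns three entries; when the searched author sits in the middle of the list it returns four, with that author flagged for highlighting.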

From The Cloud


A few days ago, we finished migrating our production site to the Cloud, more precisely to Amazon EC2. I do not know whether “Cloud Computing” needs defining, but in any case I invite you to watch this wonderful video made by the people at Salesforce.com, in which you can find an easy and intuitive definition of Cloud Computing and a list of its benefits.

What benefits does the Cloud offer to a search engine like novo|seek? I am probably going to repeat most of the arguments listed in the video, but anyhow:

  • Cost reduction: in the Cloud we pay for what we use: CPU time, storage, bandwidth…
  • Easy scaling: for a growing search engine like novo|seek, scalability is critical. For us, the user experience is very important, and so is the QoS (Quality of Service). Sizing the servers of an emerging web site is a hard task. If you fall short, any marketing or PR action that drives a lot of traffic to the site can bring the servers to their knees. Conversely, if you oversize your infrastructure, your servers will grow old inside your data center. Amazon EC2 lets us resize our production infrastructure as our traffic grows.
  • Reduction of time-to-market and of the entry barriers to innovation: the EC2 infrastructure lets us create new server instances quickly and easily, with different sizes and performance levels. If we need to try new text mining algorithms or expand our technology to new data sources, the Cloud will allow us to instantiate all the servers required to meet our extra computing needs, and we can forget about finding more room in our crowded data center. Cloud computing lowers the entry barriers to innovation for small and medium-sized companies like us.

We know that we are not the only company in the sector taking advantage of Cloud Computing. BioTeam, for example, is adapting bioinformatics solutions so that they can run on Amazon EC2.

Small and medium companies are not the only ones in The Cloud: big pharmas like Johnson & Johnson or Lilly are developing their first projects on EC2, although a recent report from McKinsey stated that Cloud Computing will not reduce costs for large corporations.

The novo|seek team is sure that moving to The Cloud will improve the quality of the service we currently provide, and will let us bring to all of you the innovative features cooking right now in our R&D department.

Greetings from the cloud!

Open access vs Free access


We have recently added new articles from PubMed Central to novoseek. This new feature provides access to “full text publications”, and we have noticed that there is quite some misunderstanding about what has actually been indexed. So let us explain it in detail.

Specifically, we have included the Open Access subset of PubMed Central. What is that? Well, Open Access is free online access to research papers. Unsurprisingly, this definition has led to some confusion and misuse of the term “open” access, as it is often treated as a synonym for “free” access.

The first definition of open access came out of the Budapest Open Access Initiative and was later revised in Bethesda and Berlin. This led to what Peter Suber calls the BBB definition of open access, on which most of the Open Access Movement agrees.
The Open Access definition rests on two ideas:

  • Free-of-charge access
  • Removal of permission barriers

Consequently, these ideas make distribution, copying and the production of derivative works possible for anyone.

Interestingly, we’ve observed that most of the time, open access is used as a synonym for free access. This is not quite correct, since open access goes beyond just free access to content. For a better understanding of the differences between them, have a look at the graphic below.

[Graphic: open access vs. free access]

PubMed Central is a free digital archive of peer-reviewed biomedical and life sciences literature developed and managed by the NIH. It gives free access to articles, some of which are open access. As we have discussed in previous posts, the NIH public access policy ensures access to the published results of NIH-funded research. However, it does not say whether that access has to be through a free access or an open access policy.

In novoseek, we have analyzed the full text of the open access subset with our text mining algorithms and made it public. You will now find full-text articles in which you can highlight all the relevant keywords and enjoy the great features of our technology.

We hope you like this new data set, and we more than welcome your comments and suggestions.