Monday, 24 October 2016

How do I submit my index information into Lablink?

When accepting sequencing submissions in the Genomics Core, there may be instances where we have to contact you if there is an error with your submission form. The most common problems relate to index information. We have put together some instructions here that we hope should make things easier and help us to get started on your sequencing as soon as we can.

Please only follow these instructions if:
·        The index sequences you have used are visible in the index sequences tab of the submission form
·        There are fewer than 384 samples within your pool.
If either of the points above is not true, please see section 3, Unspecified Index, further on in this post.

1.      Completing the sample/reagent label field

1a. Navigate to the index sequences tab of the sample submission form.

1b. Search for your index sequences

1c. Copy the index name from column C of the index sequences tab (e.g. A001-A005) into the Sample/Reagent Label column of the submission form tab.

Figure 1 - Index sequences tab of the sample submission form

Figure 2 - Submission form

2.      Completing the UDF/Index type field

2a. Select the correct UDF/Index type from the drop-down menu on the submission form tab.
IMPORTANT - please make sure the UDF/Index type field matches column B of the index sequences tab. This ensures that your library goes through our acceptance step. Please see the following two examples.

Example 1 - I am submitting a Truseq LT library consisting of 5 samples and used indexes A001-A005.
The sample/reagent label on the submission form should read A001-A005.
The UDF/index type should read Truseq LT.

Figure 3 - The index type field next to these indexes is Truseq LT, so this is what should be entered into the UDF/Index type field.

Figure 4 - Submission form

In most cases, the Index type will match the indexes you have used, as expected. However, there are now multiple kits available which share the same indexes.
Because of this, there may be some cases where the index sequences you select have a different Index type to the library you have made (see example 2 below). This may affect you if you are submitting for Nextera XT or Nextera.

Example 2 - I used indexes N701-N501, N702-N501, N703-N501, N704-N501. I prepared the libraries using a Nextera XT library prep kit.

The sample/reagent label on the submission form should read N701-N501, N702-N501, N703-N501, N704-N501. The UDF/index type should read Nextera and not Nextera XT. This is because the UDF/Index type needs to match column B on the index sequences tab.


Figure 5 - The index type field next to these indexes is Nextera so this is what should be entered into the UDF/Index type field
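The rule in both examples (the UDF/Index type comes from column B of the index sequences tab, not from the kit name) can be sketched as a simple lookup. This is only an illustration: the `INDEX_TAB` dictionary is a hypothetical excerpt holding just the indexes named in the examples above, and it assumes each index name is listed individually in column C.

```python
# Hypothetical excerpt of the index sequences tab:
# index name (column C) -> index type (column B).
# Assumption: A001-A005 are listed as five separate rows.
INDEX_TAB = {
    "A001": "Truseq LT",
    "A002": "Truseq LT",
    "A003": "Truseq LT",
    "A004": "Truseq LT",
    "A005": "Truseq LT",
    "N701-N501": "Nextera",
    "N702-N501": "Nextera",
    "N703-N501": "Nextera",
    "N704-N501": "Nextera",
}


def index_type_for(labels):
    """Return the single index type (column B) shared by all given index names."""
    missing = [label for label in labels if label not in INDEX_TAB]
    if missing:
        # Indexes that are not in the tab cannot be given a specific type.
        raise KeyError(f"not in index sequences tab: {missing}")
    types = {INDEX_TAB[label] for label in labels}
    if len(types) != 1:
        raise ValueError(f"expected exactly one index type, found: {sorted(types)}")
    return types.pop()


# Example 2: libraries prepared with a Nextera XT kit still get the type from column B.
print(index_type_for(["N701-N501", "N702-N501", "N703-N501", "N704-N501"]))
```

The point of the lookup is that the kit used to make the library never enters the decision; only the entry in the tab does.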

 
3.      Unspecified Index

If your pool has index sequences not present in the index sequences tab OR if you have a pool which is made up of more than 384 samples, you will need to submit as unspecified index.
In the submission form:
·        Sample/reagent label should read: unspecified
·        UDF/Index type should read: Unspecified (other)
You should submit your pool as one row on the form. Libraries submitted as unspecified index cannot be demultiplexed by the Genomics Core, but we do have a demultiplexing guide on Lablink which should give some useful information.

Important - since we have no index sequence information, please write the index lengths for Index 1 and Index 2 in the comments section of the form. Without this information your sequencing may be delayed whilst we contact you to check these parameters.
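As a rough sketch of the decision described in this section, the check and the resulting form row might look like the following. The function names and the "Sample name" and "Comments" keys are hypothetical. Note too that the text leaves a pool of exactly 384 samples ambiguous; this sketch assumes it is treated as unspecified, because the specified-index instructions only apply to fewer than 384 samples.

```python
# Assumption: pools of 384 samples or more go down the unspecified route.
SPECIFIED_LIMIT = 384


def needs_unspecified(labels, known_labels, n_samples):
    """True if the pool must be submitted as unspecified index."""
    too_many = n_samples >= SPECIFIED_LIMIT
    unknown = any(label not in known_labels for label in labels)
    return too_many or unknown


def unspecified_row(pool_name, index1_length, index2_length):
    """Build the single form row for an unspecified pool (column names are illustrative)."""
    return {
        "Sample name": pool_name,
        "Sample/Reagent Label": "unspecified",
        "UDF/Index type": "Unspecified (other)",
        # Index lengths go in the comments so the run can be set up without a query.
        "Comments": f"Index 1 length: {index1_length}; Index 2 length: {index2_length}",
    }


known = {"A001", "A002"}
if needs_unspecified(["custom-01", "custom-02"], known, n_samples=2):
    print(unspecified_row("my_pool", 8, 8))
```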
Once you have submitted your libraries, the Genomics Core would like to start working on your sequencing as soon as we can.

If the incorrect index type has been selected, we will need to delete your submission and ask you to submit again after making changes to your sample sheet and following the instructions above. This guide is here to help, but we are always happy to discuss your submission in person if you have any questions. Alternatively, you can contact us via our helpdesk: genomics-helpdesk@cruk.cam.ac.uk
 



Sunday, 9 October 2016

Recent papers that the Genomics Core has helped with

I like to highlight some of the really interesting work we've been involved with, or that has come out of the Institute from time to time, and I recently updated our lab home page with links to a couple of papers. I thought I'd take the opportunity to write about them in a bit more detail here. Many of you will already know I run the Genomics Core facility at CRUK's Cambridge Institute. We do a lot of Illumina sequencing! The lab works on a huge number of projects for the research groups here in the Institute, and also across many groups in Cambridge via a long-running sequencing collaboration. We do some R&D work in my lab, but >90% of our efforts are working with, or for, other research groups.
Highlights from the last year's genomics research include work from the Caldas group, who have completed three projects over the last year that I've included here: 1) profiling of almost 2500 breast cancer patients for mutational analysis of 173 genes using a targeted pull-down (Pereira et al. Nature Communications 2016); 2) cancer exomes from Murtaza et al.; 3) PDXs from Bruna et al. Also included is work from the Balasubramanian group, who have shown that it is possible to capture and sequence double-strand DNA breaks (DSBs) in situ and directly map these at single-nucleotide resolution, enabling the study of DSB origin (Lensing et al. Nature Methods 2016). The rapid speed and unbiased nature of the genome-wide experiments being performed in the Institute, and often prepped and sequenced in the Genomics Core, continue to increase our understanding of cancer biology.


Friday, 15 July 2016

Why is my HiSeq 2500 sequencing taking longer than usual?

With the introduction of the HiSeq 4000 we're able to sequence faster and cheaper than ever before. But as we transition the larger projects over to HiSeq 4000, a side effect is that there are fewer and fewer samples to run on HiSeq 2500; and because we wait for samples to fill the 8-lane flowcell, that means longer wait times for you. We thought this post might help you determine if you still need to use HiSeq 2500, or if you can migrate over to HiSeq 4000. Most sequencing is taking under 2 weeks, but some people are now waiting up to one month for 2500 data.



Running a big RNA-seq project is easy(ish)

Last year we completed our largest ever RNA-seq project: 528 samples of TruSeq mRNA, 60 lanes of HiSeq 2500 SE50, 13 billion reads - and all in 16 weeks. Being able to do such a large project in such a short time and get high quality data from nearly all samples really demonstrates the robustness of RNA-seq. If you're thinking that a project larger than 96 samples might be too much to consider, then come and talk to us (and Bioinformatics) at a Tuesday afternoon experimental design meeting - and we'll convince you it can be a pretty smooth process.



We've been using Illumina's TruSeq mRNA-seq automated on our Agilent Bravo robot and the sequencing was done on HiSeq 2500, although we're currently moving to HiSeq 4000.
  • 528 samples processed on six plates of RNA-seq
  • QC lanes sequenced and analysed
  • 60 lanes of SE50bp sequencing in total, 10 lanes per plate
  • 12,918,018,345 PF reads for this project (215M reads per lane on average)
  • 24M reads per sample on average
  • 16 weeks from start to finish
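
The headline figures in the list above can be checked with a few lines of arithmetic. The per-plate sample count is derived here rather than stated in the post, and assumes samples were spread evenly across the six plates.

```python
# Figures taken from the project summary above.
pf_reads = 12_918_018_345
lanes = 60
samples = 528
plates = 6

reads_per_lane = pf_reads / lanes        # average PF reads per lane
reads_per_sample = pf_reads / samples    # average PF reads per sample
lanes_per_plate = lanes / plates
samples_per_plate = samples / plates     # assumes an even spread across plates

print(f"Reads per lane:    {reads_per_lane / 1e6:.0f}M")
print(f"Reads per sample:  {reads_per_sample / 1e6:.0f}M")
print(f"Lanes per plate:   {lanes_per_plate:.0f}")
print(f"Samples per plate: {samples_per_plate:.0f}")
```

The same sums are a handy starting point when estimating how many lanes a new project will need for a target read depth per sample.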
This has been a large and complex project where we had lots of discussions along the way. I think that everyone involved has contributed to the success so far: the research group who asked us to do the project, my lab, and also our Bioinformatics Core. The ability to discuss the experiment at different stages, and to focus on QC issues as they arise, really makes the Cores a great place to do your projects.

Sunday, 7 February 2016

Our first paper on the bioRxiv

I just uploaded our paper, which has also been submitted to BioTechniques, onto the bioRxiv preprint server. The work we present comes from an idea I had shortly after first using Agilent's BioAnalyser in 2000. I was blown away by this piece of technology that has become the de facto standard for RNA QC, and has also pretty much replaced gel electrophoresis for DNA fragment analysis in NGS applications. When launched in 1999, it was the only microfluidics instrument for biology applications. The idea was a simple one: can bioanalyser chips be swapped between assays?

Friday, 20 November 2015

Following us on Twitter

The Genomics Core now has two Twitter accounts. You can follow me @CIgenomics (James Hadfield, Head of Genomics) and hear about things I think are interesting, but which you might not necessarily be interested in; and/or you can follow our sequencing queue @CRUKgenomecore, which puts out live Tweets directly from the sequencing LIMS.



How does the LIMS Tweet: Some clever work by Rich in Bioinformatics has allowed us to pull out data directly from the Genologics Clarity LIMS queue using a script run every 24 hours, and the Twitter API then allows that script to post messages on our behalf. Because of this, the Tweets about our queue should happen every day and without manual intervention. Hopefully you'll be able to rely on these to give you a reasonable idea of how long you might have to wait for your sequencing results. Of course we can't predict what will happen with your particular sample, so please treat the Tweet as a guide.
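
For anyone curious about the general shape of such a script, here is a minimal sketch. It is not the script used in the Core: the queue fields, the message layout and the function names are all made up for illustration, and the LIMS query and the Twitter call are passed in as plain functions (`fetch_queue` and `post`) rather than written against either real API.

```python
from datetime import date

TWEET_LIMIT = 140  # character limit on a Tweet when this post was written


def format_queue_tweet(queue, today):
    """Turn {run type: lanes waiting} into one status line (hypothetical format)."""
    parts = [f"{run_type}: {lanes} lanes" for run_type, lanes in sorted(queue.items())]
    text = f"Queue {today.isoformat()} | " + "; ".join(parts)
    if len(text) > TWEET_LIMIT:
        # Truncate rather than fail so the daily update still goes out.
        text = text[: TWEET_LIMIT - 3] + "..."
    return text


def daily_update(fetch_queue, post, today=None):
    """One scheduled run: read the queue, build the message, hand it to the poster."""
    text = format_queue_tweet(fetch_queue(), today or date.today())
    post(text)
    return text


# Stand-ins for the LIMS query and the Twitter client.
daily_update(lambda: {"HiSeq 2500 SE50": 12, "HiSeq 4000 PE150": 5}, print)
```

Keeping the two external services behind plain function arguments means the formatting can be tested without touching either of them; the once-a-day scheduling would be handled outside the script.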

Tweets explained: The Tweets have a format that we hope is pretty intuitive, but we've described what all the bits of information mean below...


Thanks especially to Rich Bowers in the Bioinformatics core for pulling all of this together from a vaguely described idea by me.

Friday, 4 September 2015

Improving DNA and RNA quant with plate based fluorimetry

We quantify NGS libraries all the time and qPCR works brilliantly, but nucleic acids need to be handled differently. We don't actually run that much quantification on DNA and RNA as most of our users have already done this; we asked them to do it so we could more efficiently run larger batches of library prep to keep costs down and turnaround times as short as possible. Over the last few years we've been running the Nextera exome preps and DNA quant has become more important than ever before; in fact we started running a secondary quant just to be certain about DNA concentration.

Most of the time DNA and RNA quant works well and we've favoured the fluorescent Qubit assay recommended by Illumina in their protocols. A NanoDrop or plate-reading spectrophotometer measuring absorbance at 260/280 nm reports total nucleic acid and is confounded by ssDNA, RNA, and oligos, so can give inaccurate results. We run the Qubit dsDNA BR Assay from Molecular Probes on the PHERAstar fluorescent plate reader (here's their handy protocol). We have only been using 1 µl of DNA (Illumina suggest 2) for each sample, but we run triplicate assays to get a high-quality quantitation.

Problems with the Qubit assay: Recently some users have reported problems with the accuracy of the Qubit assay on our plate reader, and the manager of our Research Instrumentation Core helped us to get to the bottom of the issues and on to some excellent results. The main problem turned out to be the addition of DNA into the working dye solution: it was the DNA coating the outside of the tips that appeared to be making the results so flaky. Changing the protocol to add DNA to the plate first fixed it and the results are looking great.

It can also be very important to be certain which assay you should use: BR (Broad Range) or HS (High Sensitivity). If you are working with low concentration nucleic acids then the HS assay is probably the one to use. For really accurate quant we'd suggest a quick QT check first, then normalisation of samples to about twice what you need; a second triplicate and robust quant will allow you to dilute the samples to the perfect working concentration.
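
The two-step approach above boils down to a triplicate average followed by a C1 x V1 = C2 x V2 dilution. Here is a small sketch of that calculation; the function names, the example readings and the idea of using the CV to flag a poor triplicate are illustrative assumptions, not part of our protocol.

```python
from statistics import mean, stdev


def triplicate_summary(readings_ng_per_ul):
    """Return (mean concentration, % CV) for replicate readings of one sample."""
    avg = mean(readings_ng_per_ul)
    cv_percent = stdev(readings_ng_per_ul) / avg * 100
    return avg, cv_percent


def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (sample volume, diluent volume) in ul using C1*V1 = C2*V2."""
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("sample is already below the target concentration")
    sample_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return sample_ul, final_volume_ul - sample_ul


# Hypothetical triplicate readings for a sample normalised to roughly 2x the target.
avg, cv = triplicate_summary([10.4, 9.8, 10.1])
sample_ul, diluent_ul = dilution_volumes(avg, target_ng_per_ul=5.0, final_volume_ul=20.0)
print(f"mean {avg:.2f} ng/ul, CV {cv:.1f}%")
print(f"mix {sample_ul:.2f} ul sample with {diluent_ul:.2f} ul diluent")
```

Starting from about twice the target concentration keeps the pipetted sample volume close to half the final volume, which is easier to pipette accurately than a very small volume.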

Here are our top tips:
  • Add DNA to the measurement plate/tubes before anything else
  • Use a repeat pipette to make sure each well gets the same/right amount of dye solution
  • Shake the tubes/plate in the dark for at least 10 minutes (quant will be inaccurate if the dye has not intercalated properly; you can check your standard curve replicates to verify whether this is an issue)
  • The triplicates really are worth the effort - especially if you're doing a Nextera prep