teekert
4 months ago
We are not “in the nanopore era of sequencing”. We are (still) firmly in the sequencing by synthesis era.
Yes, it requires chopping the genome into small(er) pieces (than with Nanopore sequencing) and then reconstructing the genome based on a reference (and this has its issues). But Nanopore sequencing is still far from perfect due to its high error rate. Any clinical sequencing is still done using sequencing by synthesis (at which Illumina has gotten very good over the past decade).
Nanopore devices are truly cool, small and comparatively cheap though, and you can compensate for the error rate by just sequencing everything multiple times. I’m not too familiar with the economics of this approach though.
With SBS technology you could probably sequence your whole genome 30 times (a normal “coverage”) for below 1000€/$ with a reputable company. I’ve seen $180, but not sure if I’d trust that.
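A toy illustration of the “compensate by sequencing multiple times” point above, in Python. The reads are invented, and it assumes they are already aligned, equal length, and only contain substitution errors, which is a big simplification; real consensus callers also weigh base qualities and handle indels.

    from collections import Counter

    def consensus(reads):
        """Majority vote at each position across several noisy reads
        of the same molecule (assumes aligned, equal-length reads with
        substitution errors only)."""
        return "".join(
            Counter(bases).most_common(1)[0][0]
            for bases in zip(*reads)
        )

    # Three noisy copies of the same (made-up) fragment:
    reads = ["ACGTTAGC", "ACGTTGGC", "ACCTTAGC"]
    print(consensus(reads))  # ACGTTAGC -- isolated random errors get voted out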
Metacelsus
4 months ago
> you can compensate for the error rate by just sequencing everything multiple times.
Usually, but sometimes the errors are correlated.
Overall I agree, short read sequencing is a lot more cost effective. Doing an Illumina whole genome sequence for cell line quality control (at my startup) costs $260 in total.
bonsai_spool
4 months ago
> But Nanopore sequencing is still far from perfect due to its high error rate. Any clinical sequencing is still done using sequencing by synthesis (at which Illumina has gotten very good over the past decade).
There is no reason for Nanopore to supplant sequencing-by-synthesis for short reads - that's largely solved and getting cheaper all the while.
The future clinical utility will be in medium- and large-scale variation. We don't understand this in the clinical setting nearly as well as we understand SNPs. So Nanopore is being used in the research setting and to diagnose individuals with very rare genetic disorders.
(edit)
> We are not “in the nanopore era of sequencing”. We are (still) firmly in the sequencing by synthesis era.
I also strongly disagree.
SBS is very reliable, but it's also commonplace (if Toyota is the most popular car, does that mean we're in the Toyota internal combustion era? Or can Waymo still matter despite its small footprint?).
Novelty in sequencing is coming from ML approaches, RNA-DNA analysis, and combining long- and short-read technologies.
teekert
4 months ago
I agree with you. Long reads lead to new insights and, over time, to better diagnoses by providing a better understanding of large(r) scale aberrations, and as the tech gets better it will be able to do so more easily. But it’s really not there yet. It’s mostly research, and somehow it’s not really improving as much as hoped, I get the feeling.
Onavo
4 months ago
You can get it pretty damn cheap if you are willing to send your biological data overseas. Nebula genomics and a lot of other biotechs do this by essentially outsourcing to China. There's no particular technology secret, just cheaper labor and materials.
vintermann
4 months ago
Can you trust it though? It'd be trivially easy to do a 1x read, maybe 2x, and then fake the other 28 reads. And it'd be hard to catch someone doing this without doing another 30x read from someone you trust. There's famously a lot of cheating in medical research, it would be odd if everyone stopped the moment they left academia (there have been scandals with forensic labs cheating too, now that I think about it).
gillesjacobs
4 months ago
They save money through cheap labour and by batching large quantities for analysis. For the consumer this means long wait times and potentially expired DNA samples.
I tried two samples with Nebula and waited 11 months total. Both samples failed. Got a refund on the service but spent $50 in postage for the sample kit.
jefftk
4 months ago
> We are (still) firmly in the sequencing by synthesis era.
It really depends what your goals are. At the NAO we use Illumina with their biggest flow cell (25B) for wastewater because the things we're looking for (ex: respiratory viruses) are a small fraction of the total nucleic acids and we need the lowest cost per base pair. But when we sequence nasal swabs these viruses are a much higher fraction, and the longer reads and lower cost per run of Nanopore make it a better fit.
the__alchemist
4 months ago
I guess this depends on the application. For whole human genome? Not nanopore era. For plasmids? Absolutely.
I'm a nobody, and I can drop a tube into a box in a local university, and get the results emailed to me by next morning for $15USD. This is due to a streamlined nanopore-based workflow.
celltalk
4 months ago
This is wrong; a lot of diagnostic labs are actually going for nanopore sequencing since its prep is overall cheaper compared to the alternatives. Also, the sensitivity for the relevant regions usually matches qPCR, and it can give you more information, such as methylation, on top of that.
A recent paper on classifying acute leukemia via nanopore: https://www.nature.com/articles/s41588-025-02321-z/figures/8
The timelines are exaggerated, but still, it works and that’s what matters in diagnostics.
BobbyTables2
4 months ago
I’ve always wondered how the reconstruction works.
It would be difficult to break a modest program into basic blocks and then reconstruct it. Same with paragraphs in a book.
How does this work with DNA?
__MatrixMan__
4 months ago
You align it to a reference genome.
It's like you have an intact 6th edition of a textbook, and you have several copies of the 7th edition sorted randomly with no page numbers. Programs like BLAST will build an index based on the contents of 6, and then each page of 7 can be compared against the index and you'll learn that for a given page of 7 it aligns best at character 123456 of 6 or whatever.
Do that for each page in your pile and you get a chart where on the X axis is the character index of 6 and on the Y axis is the number of pages of 7 which were aligned there. The peaks and valleys in that graph can tell you about the inductive strength of your assumption that a given read is aligned correctly to the reference genome (plus you score it based on mismatches, insertions and gaps).
So if many of the same pages were chosen for a given locus, yet the sequence differs, then you have reason to trust that there's an authentic difference between your sample and the reference in that location.
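A toy version of that index-and-count idea in Python (reference and reads invented; real aligners like BLAST or BWA handle mismatches, gaps and scoring, which this skips):

    from collections import defaultdict, Counter

    def build_index(reference, k=4):
        """Map every k-mer in the reference to the positions where it occurs."""
        index = defaultdict(list)
        for i in range(len(reference) - k + 1):
            index[reference[i:i + k]].append(i)
        return index

    def place_read(read, index, k=4):
        """Vote for the read's start position using its exact k-mer seeds."""
        votes = Counter()
        for offset in range(len(read) - k + 1):
            for pos in index.get(read[offset:offset + k], []):
                votes[pos - offset] += 1
        return votes.most_common(1)[0][0] if votes else None

    reference = "ACGTACGGTTAACCGGATCGTTACG"
    reads = ["CGGTTAACC", "GATCGTTAC", "ACGTACGGT"]
    index = build_index(reference)
    coverage = [0] * len(reference)
    for read in reads:
        start = place_read(read, index)
        if start is not None:
            for i in range(start, min(start + len(read), len(reference))):
                coverage[i] += 1
    print(coverage)  # the per-position "pages aligned here" counts from the analogy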
There are a lot of chemical tricks you can do to induce meaningful non-uniformity in this graph. See ChIP-Seq for instance, where peaks indicate histone methylation marks, which typically correspond to a gene that was enabled for transcription when the sample was taken.
If you don't have a reference genome, then you can run the sample on a gel to separate the sequences by length; that'll group them by chromosome. From there you've got a much more computationally challenging problem, but as long as you can ensure that the DNA is cut at random locations before reads are taken, you can use overlaps to figure out the sequence, because unlike the textbook page example, the page boundaries are not gonna line up (but the chromosome ends are):
Mary had a little
was white as snow
lamb whose fleece was
Mary had
had a little lamb
a little lamb
was white
white as snow
So you can find the start and ends based on where no overlaps occur (nothing ever comes before Mary or after snow), and then you can build the rest of the sequence based on overlaps.

If you're working with circular chromosomes (bacteria and some viruses) you can't reason based on ends, but as long as you have enough data there's still gonna be just one way to make a loop out of your reads. (Imagine the above example, but with the song that never ends. You could still manage to build a loop out of it despite not having an end to work from.)
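A rough Python sketch of that overlap idea using the fragments above, greedily merging the best-overlapping pair each round (real assemblers deal with read errors, repeats and far smarter data structures):

    def overlap(a, b):
        """Length of the longest suffix of word list a that is a prefix of b."""
        for k in range(min(len(a), len(b)), 0, -1):
            if a[-k:] == b[:k]:
                return k
        return 0

    def greedy_assemble(fragments):
        """Repeatedly merge the pair of fragments with the largest overlap."""
        frags = [f.split() for f in fragments]
        while len(frags) > 1:
            best = None
            for i, a in enumerate(frags):
                for j, b in enumerate(frags):
                    if i != j:
                        k = overlap(a, b)
                        if best is None or k > best[0]:
                            best = (k, i, j)
            k, i, j = best
            merged = frags[i] + frags[j][k:]
            frags = [f for n, f in enumerate(frags) if n not in (i, j)] + [merged]
        return " ".join(frags[0])

    fragments = [
        "Mary had a little",
        "was white as snow",
        "lamb whose fleece was",
        "Mary had",
        "had a little lamb",
        "a little lamb",
        "was white",
        "white as snow",
    ]
    print(greedy_assemble(fragments))
    # Mary had a little lamb whose fleece was white as snow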
vintermann
4 months ago
They exploit the fact that so much of our DNA is the same. They basically have the book with no typos, or rather with only the typos they've decided to call canonical.
So given a short sentence excerpt, even with a few errors thrown in, partial string matching is usually able to figure out where in the book it was likely from. Sometimes there may be more possibilities, but then you can look at overlaps and count how many times a particular variant appears in one context vs. another.
One problem is, DNA contains a lot of copies and repetitive stretches, as if the book had "all work and no play makes jack a dull boy" repeated end to end for a couple of pages. Then it can be hard to place where the variant actually is. Longer reads help with this.
jakobnissen
4 months ago
There are two ways: assembly by mapping and de novo assembly.
If you already have a human genome file, you can take each DNA piece and map it to its closest match in the genome. If you can cover the whole genome this way, you are done.
The alternative way is to exploit overlaps between DNA fragments. If two 1000 bp pieces overlap by 900 basepairs, that's probably because they come from two 1000 bp regions of your genome that overlap by 900 basepairs. You can then merge the pieces. By iteratively merging millions of fragments you can reconstruct the original genome.
Both these approaches are surprisingly and delightfully deep computational problems that have been researched for decades.
bonsai_spool
4 months ago
This is very easily googled. There are new algorithmic advances for new kinds of sequencing data, but the key ideas date from the 70s.
nextaccountic
4 months ago
If you broke a string into overlapping blocks, you could easily reconstruct it. The key here is that the blocks form a sliding window over the string.
If the blocks were nonoverlapping, then yeah, the problem is much harder, akin to fitting pieces of a puzzle. I bet a language model could still do it though.
jltsiren
4 months ago
The basic assumption is that most of the genome is essentially random. If you take a short substring from an arbitrary location, it will likely define the location uniquely. Then there are some regions with varying degrees of repetitiveness that require increasingly arcane heuristics to deal with.
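A quick way to see that assumption in action with a purely synthetic random sequence (a real genome is less random, which is exactly where the arcane heuristics come in):

    import random

    random.seed(0)
    genome = "".join(random.choice("ACGT") for _ in range(1_000_000))

    k = 20
    seen, repeats = set(), 0
    for i in range(len(genome) - k + 1):
        kmer = genome[i:i + k]
        if kmer in seen:
            repeats += 1
        seen.add(kmer)
    # For a megabase of random sequence, a 20-letter substring is almost
    # always unique, so it pins down its own location.
    print(f"{repeats} repeated {k}-mers out of {len(genome) - k + 1}")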
There are two basic approaches: reference-based and de novo assembly. In reference-based assembly, you already have a reference genome that should be similar to the sequenced genome. You map the reads to the reference and then call variants to determine how the sequenced genome is different from the reference. In de novo assembly, you don't have a reference or you choose to ignore it, so you assemble the genome from the reads without any reference to guide (and bias) you.
Read mapping starts with using a text index to find seeds: fixed-length or variable-length exact matches between the read and the reference. Then, depending on seed length and read length, you may use the seeds directly or try to combine them into groups that likely correspond to the same alignment. With short reads, it may be enough to cluster the seeds based on distances in the reference. With long reads, you do colinear chaining instead. You find subsets of seeds that are in the same order both in the read and the reference, with plausible distances in both.
Then you take the most promising groups of seeds and align the rest of the read to the reference for each of them. And report the best alignment. You also need to estimate the mapping quality: the likelihood that the reported alignment is the correct one. That involves comparing the reported alignment to the other alignments you found, as well as estimating the likelihood that you missed other relevant alignments due to the heuristics you used.
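A bare-bones version of the colinear chaining step, with invented (read position, reference position) seed coordinates; real chainers also score anchor lengths and penalize the gaps between consecutive seeds:

    def colinear_chain(seeds):
        """Longest chain of seeds whose positions increase in both the read
        and the reference (simple O(n^2) dynamic programming)."""
        seeds = sorted(seeds)                  # sort by (read_pos, ref_pos)
        best = [1] * len(seeds)                # best chain length ending at seed i
        prev = [None] * len(seeds)
        for i, (ri, gi) in enumerate(seeds):
            for j, (rj, gj) in enumerate(seeds[:i]):
                if rj < ri and gj < gi and best[j] + 1 > best[i]:
                    best[i] = best[j] + 1
                    prev[i] = j
        i = max(range(len(seeds)), key=lambda n: best[n])
        chain = []
        while i is not None:
            chain.append(seeds[i])
            i = prev[i]
        return chain[::-1]

    # The (40, 900) seed is a spurious hit to a repeat elsewhere in the
    # reference; chaining drops it because it breaks the shared order.
    seeds = [(10, 5010), (25, 5025), (40, 900), (60, 5061), (80, 5080)]
    print(colinear_chain(seeds))
    # [(10, 5010), (25, 5025), (60, 5061), (80, 5080)]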
In variant calling, you pile the alignments over the reference. If most reads have the same edit (variant) at the same location, it is likely present in the sequenced genome. (Or ~half the reads for heterozygous variants in a diploid genome.) But things get complicated due to larger (structural) variants, sequencing errors, incorrectly aligned reads, and whatever else. Variant calling was traditionally done with combinatorial or statistical algorithms, but these days it's best to understand it as an image classification task.
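A toy version of the traditional counting approach (alignments invented; substitutions only, no indels, base qualities, or a proper diploid genotyping model, and nothing like the image-classification style mentioned above):

    from collections import Counter

    def call_variants(reference, alignments, min_fraction=0.3):
        """Pile aligned reads onto the reference and report positions where
        a non-reference base is supported by enough of the covering reads."""
        piles = [Counter() for _ in reference]
        for start, read in alignments:
            for offset, base in enumerate(read):
                piles[start + offset][base] += 1
        variants = []
        for pos, pile in enumerate(piles):
            depth = sum(pile.values())
            for base, count in pile.items():
                if base != reference[pos] and count / depth >= min_fraction:
                    variants.append((pos, reference[pos], base, count, depth))
        return variants

    reference = "ACGTACGTAC"
    alignments = [(0, "ACGTAC"), (2, "GTTCGTAC"), (4, "TCGTAC")]
    print(call_variants(reference, alignments))
    # [(4, 'A', 'T', 2, 3)] -- two of the three covering reads support an A->T change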
De novo assembly starts with brute force: you align all reads against each other and try to find long enough approximate overlaps between them. You build a graph, where the reads are the nodes and each good enough overlap becomes an edge. Then you try to simplify the graph, for example by collapsing segments, where all/most reads support the same alignment, into a single node, and removing rarely used edges. And then you try to find sufficiently unambiguous paths in the graph and interpret them as parts of the sequenced genome.
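A sketch of the all-against-all overlap graph construction (exact suffix-prefix overlaps and a quadratic loop over three invented reads; real assemblers use approximate overlaps and much cleverer indexing):

    def suffix_prefix_overlap(a, b, min_len=3):
        """Longest suffix of read a that equals a prefix of read b,
        if it is at least min_len long."""
        for k in range(min(len(a), len(b)), min_len - 1, -1):
            if a[-k:] == b[:k]:
                return k
        return 0

    def overlap_graph(reads, min_len=3):
        """Nodes are read indices; an edge i -> j means read i's suffix
        overlaps read j's prefix by the given number of bases."""
        edges = {}
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = suffix_prefix_overlap(a, b, min_len)
                    if k:
                        edges.setdefault(i, []).append((j, k))
        return edges

    reads = ["ACGTTAGC", "TTAGCCAT", "GCCATGGA"]
    print(overlap_graph(reads))
    # {0: [(1, 5)], 1: [(2, 5)]} -- one unambiguous path 0 -> 1 -> 2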
There are also some pre-/postprocessing steps that can improve the quality of de novo assembly. You can do some error correction before assembly. If the average coverage of the sequenced genome is 30x but you see a certain substring only once or twice, it is likely a sequencing error that can be corrected. Or you can polish the assembly afterwards. If you assembled the genome from long reads (with a higher error rate) for better contiguity, and you also have short reads (with a lower error rate), you can do something similar to reference-based assembly, with the preliminary assembly as the reference, to fix some of the errors.
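And a minimal sketch of the k-mer counting idea behind that kind of error correction (toy reads; a real corrector also decides what to replace a rare k-mer with rather than just flagging it):

    from collections import Counter

    def suspicious_kmers(reads, k=5, min_count=3):
        """k-mers seen fewer than min_count times across high-coverage reads
        are likely sequencing errors rather than real genome content."""
        counts = Counter()
        for read in reads:
            for i in range(len(read) - k + 1):
                counts[read[i:i + k]] += 1
        return {kmer for kmer, c in counts.items() if c < min_count}

    # Ten noisy copies of the same fragment; one read has a single G->C error.
    true_fragment = "ACGTACGGTTAACC"
    reads = [true_fragment] * 9 + ["ACGTACGCTTAACC"]
    print(sorted(suspicious_kmers(reads)))
    # Flags only the five 5-mers that span the erroneous base.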
Danjoe4
4 months ago
Nanopore is good for hybrid sequencing. You can align the higher-quality Illumina reads against its longer contiguous reads.