Yes; but in the case of a "common misconception" like this, there's always also a nontrivial minority who do know the "right answer"[1] — and so enough examples of that occur in the training data to enable the model to embed the knowledge-web of "right ideas" (as a niche activation), alongside the "wrong idea" (in its default-mode network).
The "initial fine-tuning of a 'raw' base model to produce a 'generalized pre-trained' base model" process, is commonly talked about in terms of "alignment" — making the model ethical, making it not swear at you, making it refuse to engage with certain content, etc. And really, that part is all optional, with there existing "non-aligned" or "orthogonalized" models that don't have these steps performed on them or have had them reversed, but which are still useful.
But a large part of this initial fine-tuning process consists of debiasing the model's default activations: moving it away from making associations with "common misconceptions" and toward making associations with "right answers." And this process is crucial to a model being able to reason intelligently, because these common misconceptions don't cohere with the rest of whatever chain of reasoning they appear in, and so cause the chain to fall apart / go nowhere. This is, in large part, the "secret sauce" that makes one model of a given size "more intelligent" than another model of the same size.
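To make that concrete: here's a minimal sketch of what such a de-biasing pass could look like, if you cast it as plain supervised fine-tuning on (misconception-triggering prompt, corrected completion) pairs. The model, the example pair, and the hyperparameters are placeholder assumptions of mine, not any lab's actual recipe:

```python
# Hypothetical de-biasing pass as ordinary supervised fine-tuning:
# train on corrected completions so the "right answer" becomes the
# default association instead of a niche one.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")          # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

pairs = [  # (misconception-triggering prompt, corrected completion)
    ("We only use 10% of our brains, so",
     " actually, imaging studies show nearly all of the brain is active over a day."),
]

for prompt, correction in pairs:
    batch = tok(prompt + correction, return_tensors="pt")
    # Standard causal-LM objective; in practice you'd mask the prompt
    # tokens out of the loss, omitted here for brevity.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    opt.step()
    opt.zero_grad()
```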
Every base model that anyone actually cares about or uses, "aligned" or not, has had some de-biasing process like this applied to it; or, at the very least, has shortcut the process by training on a dataset generated or filtered by a model that has already been de-biased, such that the derived dataset doesn't contain the "common misconceptions" in the first place.
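A toy sketch of that filtering shortcut, where query_judge_model is a hypothetical stand-in for whatever inference endpoint the already-de-biased model lives behind (the prompt wording is mine, purely illustrative):

```python
# Model-based corpus filtering: ask an already-de-biased "judge" model
# whether each document asserts a known misconception, keep the rest.

def query_judge_model(prompt: str) -> str:
    """Hypothetical call to an already-de-biased model; returns its text reply."""
    raise NotImplementedError("wire up your own inference endpoint here")

def contains_misconception(document: str) -> bool:
    reply = query_judge_model(
        "Does the following text assert a common factual misconception "
        "as if it were true? Answer YES or NO.\n\n" + document
    )
    return reply.strip().upper().startswith("YES")

def filter_corpus(documents: list[str]) -> list[str]:
    # Keep only documents the judge considers misconception-free.
    return [doc for doc in documents if not contains_misconception(doc)]
```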
And when OpenAI and Meta brag about using RLHF, a large part of what they mean is crowdsourcing the recognition of long-tail "common misconceptions" at scale, enabling a much more thorough version of this de-biasing process[2].
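For what it's worth, the standard way such crowdsourced judgments become training signal is a pairwise reward-model loss (Bradley-Terry). A minimal sketch, assuming reward_model is any network mapping an encoded (prompt, response) to a scalar score:

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_model, chosen_enc, rejected_enc):
    # Labelers preferred `chosen` (e.g. it avoids the misconception).
    r_chosen = reward_model(chosen_enc)     # scalar score per example
    r_rejected = reward_model(rejected_enc)
    # Maximize P(chosen > rejected) under a Bradley-Terry model.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy demo with random 768-d "encodings" and a linear scorer:
rm = torch.nn.Linear(768, 1)
chosen, rejected = torch.randn(8, 768), torch.randn(8, 768)
print(preference_loss(lambda x: rm(x).squeeze(-1), chosen, rejected))
```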
---
[1] Of course, if nobody in the training data ever demonstrates the "right answer" knowledge/associations, then the model will never learn them. But given that these training datasets usually represent decent samples of what the population has written down, a "right answer" being missing entirely would likely mean that nobody on Earth knows the "right answer", so we humans wouldn't be able to recognize that the model was wrong, either. The stock market can be wrong in the same way, for the same reason: it only aggregates what somebody, somewhere, actually knows.
[2] Which, perhaps surprisingly, can add up to more than the sum of its parts. The more of this human-labelled "common misconception"-response RLHF data you have, the more patterns you can derive from the bias data. You can distil out negative examples and use them to prompt a model into filtering the training dataset; but more interestingly, you can distil out positive examples of the sort of structured chains of reasoning that inherently avoid triggering the bias. And if you can overlap the activation-space hyperspheres of many such inversion-of-bias examples, then you get, essentially, the hypersphere within the model's activation space that contains its instrumental rationality. You can then just bias the model toward living in that part of activation space as much as possible, and this shoots its apparent reasoning capacity way up.
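The closest published recipe I know of for that last step is difference-of-means activation steering, from the representation-engineering literature: average the residual-stream activations over bias-avoiding prompts, subtract the average over bias-triggering ones, and nudge the hidden states along that direction at inference. A rough sketch with GPT-2, where the layer index, scale, and example prompts are arbitrary choices of mine rather than anything the labs have documented:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
LAYER, SCALE = 6, 4.0   # both arbitrary; would need tuning in practice

@torch.no_grad()
def mean_activation(prompts):
    # Average block LAYER's output hidden state over prompts and positions.
    acts = []
    for p in prompts:
        out = model(**tok(p, return_tensors="pt"), output_hidden_states=True)
        acts.append(out.hidden_states[LAYER + 1].mean(dim=1))  # [0] is embeddings
    return torch.cat(acts).mean(dim=0)

# One toy "inversion-of-bias" pair; in practice you'd average over many.
steer = (mean_activation(["Let's check that claim step by step against the evidence."])
         - mean_activation(["Everyone knows that's true; no need to check."]))

def hook(module, inputs, output):
    # GPT-2 blocks return a tuple; shift the hidden states along `steer`.
    return (output[0] + SCALE * steer,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(hook)
ids = model.generate(**tok("The reason this works is", return_tensors="pt"),
                     max_new_tokens=40)
handle.remove()
print(tok.decode(ids[0]))
```

The "overlap many hyperspheres" intuition above corresponds here to averaging steering vectors over many such pairs, which tends to cancel out example-specific noise and keep whatever shared "check your premises" direction the examples have in common.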