The Alchemy of Overdiagnosis (Part 3)

A 47-year-old female has a screen-detected invasive ductal carcinoma, 1.0cm, Grade 2, node negative (Luminal B if we delve a little deeper). She undergoes breast conservation and radiation therapy. Was this cancer overdiagnosed?

Of course, we don’t know, but there are 3 possible scenarios regarding her tumor biology:

Biologic A – her tumor was local at the time of mammographic discovery and would still be local in a year or two when it became palpable. Thus, screening and “early diagnosis” was of no benefit in the long run. She will be cured whether the diagnosis is early or later on.

Biologic B – her tumor was already systemic at the time of mammographic discovery, so screening was of no benefit, and “early diagnosis” was only an illusion.

Note: The original Fisher Theory of breast cancer biology ends here with only two options – Biologic A and Biologic B. That said, Fisher Theory was developed prior to screening mammography, back when diagnosis occurred at a single point in time upon tumor palpation. Early detection through asymptomatic screening converted that single point into a window of opportunity by gaining a jump on the natural history of the disease. The success of the mammographic screening trials introduced a third scenario that plays out in the minority of patients, although enough to achieve a measurable mortality reduction, that is:

Goldilocks Biology – her tumor is local at the time of mammographic discovery, but its natural history is such that, if left in the breast until it becomes palpable, it will become systemic. This is the only scenario where screening benefits the patient. Even with Goldilocks biology, the screening must occur at the right point in time.

Some are astounded to think that only the minority of patients have Goldilocks biology. Stated alternatively, most so-called “early detections” don’t save a life. This is not a serious point of contention among screening epidemiologists (although there are wide variations in quantification). If Goldilocks biology dominated, then the mortality reduction with mammographic screening would be far greater than what we actually see.
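That last sentence can be put into a back-of-the-envelope sketch. The fractions below are purely illustrative assumptions, not estimates from any trial, and the function name is invented here. The sketch also makes the simplifying assumption that every tumor that turns systemic is ultimately fatal, which ignores systemic therapy. The only things taken from the three scenarios above are that Biologic A patients are cured either way, Biologic B patients get no benefit, and Goldilocks patients benefit only when the screen catches the tumor in time.

```python
def mortality_reduction(frac_a, frac_b, frac_goldilocks, caught_in_time):
    """Fraction of breast cancer deaths averted by screening (toy model).

    frac_a, frac_b, frac_goldilocks: hypothetical shares of each biology.
    caught_in_time: share of Goldilocks tumors found while still local.
    """
    total = frac_a + frac_b + frac_goldilocks
    if abs(total - 1.0) > 1e-9:
        raise ValueError("biology fractions must sum to 1")
    # Without screening, Biologic B and Goldilocks tumors both become fatal.
    deaths_without_screening = frac_b + frac_goldilocks
    # Screening rescues only Goldilocks tumors that are detected in time.
    deaths_averted = frac_goldilocks * caught_in_time
    return deaths_averted / deaths_without_screening


# Hypothetical "Goldilocks is a minority" versus "Goldilocks dominates".
minority = mortality_reduction(0.35, 0.40, 0.25, caught_in_time=0.8)
dominant = mortality_reduction(0.10, 0.10, 0.80, caught_in_time=0.8)
print(f"minority: {minority:.0%}, dominant: {dominant:.0%}")
```

With these made-up inputs the dominant case yields a reduction more than twice that of the minority case, which is the shape of the argument: a modest observed mortality reduction implies that Goldilocks biology is the minority, or that many Goldilocks tumors are not being caught in time, or both.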

As for Biologic A, this is the fodder for anti-screening epidemiologists and guardians of public health who have used the alchemy of indirect reasoning and a warehouse of assumptions to convert this well-known group of patients, perhaps one-third of mammographic discoveries, into “Overdiagnosis.” The distinction is this: Biologic A patients will eventually progress (if they live long enough) and will still be diagnosed and treated for breast cancer without screening.

Not true in Overdiagnosis. With overdiagnosis, the Biologic A tumors never progress. Rather than the length bias of slow-growing tumors, overdiagnosed cancers are “pseudocancers” that would never harm the patient if it weren’t for that darned screening obsession in countries like the U.S. Dr. H. Gilbert Welch makes the claim that this is the case for 70,000 women every year in the U.S., such that more than a million women are walking around today thinking they’ve been cured of their cancer when, in fact, they only had pseudocancers in the first place. (The only pseudocancers, I would submit, are actually misdiagnoses rather than overdiagnosis, e.g. the adenosis family of lesions and radial scars/CSLs, no small problem…but I’ll save that for another day.)

Now, returning to our 47-year-old newly diagnosed patient, let’s assign her to Goldilocks biology and an optimally timed mammogram that found the lesion while still local in the breast. This patient is one of the lucky minority whose life is indeed saved by screening mammography. On the final day of her radiation treatment, however, she walks out of the hospital and is struck by the proverbial bus and dies on the spot. What do the strange brews and concoctions of Big Data do with this patient? She will appear in the column of Overdiagnosis.

This scenario is obviously not going to explain 70,000 cases a year of pseudocancers, but the point is this: Life expectancy allows for gross manipulation in this debate, while Big Data has severe shortcomings. Amazingly, the SEER data is used routinely and frequently to endorse and solidify overdiagnosis, in spite of the fact that “method of detection” is not part of the SEER data. How can you draw reliable conclusions about screening mammography when you have no idea about mammographic utilization, compliance or method of detection?

The SEER data is often used to explore the biology of small cancers, pointing out that aggressive tumors occur as interval cancers. This is where the assumptions kick in, with smaller tumors “assumed” to have been discovered by mammography. It is true that there is a modest trend toward lower grade cancers detected by screening and higher grade cancers appearing during screening intervals. Thus, we have length bias that is very real, but this is still overpowered by the mortality reductions seen in the prospective, randomized screening trials.

That said, the difference between the biology of screen-detected and interval cancers is not all that it’s made out to be. The Overdiagnosis Mob tries to convince us that interval cancers are the only killers and that they pop up in between screens. I’ve spent a great deal of time over the years reviewing the interval cancer data (and there’s a lot of it), and it’s not exactly what many claim. The majority of interval tumors have nothing to do with aggressiveness. The reason that they “pop up” in between screens is that they were “garden variety” breast cancers buried in the white patches of mammography, finally emerging as palpable. They were present on the prior mammogram and large enough to be detected had they interfaced with fatty tissue. But by virtue of their location in a dense patch, they were not visible. They would have been easily detectable, however, had ultrasound or MRI been used. MOST interval cancers are mammographic failures that have nothing to do with the inherent biology of the cancer.

For many years, this was conjecture supported by reasonably good evidence, but Dr. Christiane Kuhl put the argument to rest with her landmark study of screening MRI in a normal risk population. With the reliable detection of MRI, guess how many cancers were left to “pop up” in the interval between screenings? ZERO. That’s right. The dreaded interval cancer disappeared by simply using a tool that detects cancer more reliably than mammography. Again, the majority of interval cancers are mammographic misses that have nothing to do with biology. Even the high-risk MRI screening trials, where interval cancers should have been in the 30-40% range, saw, with a few exceptions, single-digit rates of interval cancers.

And what is the response to MRI (and other modalities) by the Overdiagnosis Cabal? Here’s the reason why we need to make the distinction between Biologic A tumors with length bias vs. true overdiagnosis – the Overdiagnosis Squad condemns improvements in screening with multi-modality imaging because “you will only make the overdiagnosis problem worse!” And there you have it, in print, by anti-screeners as well as the U.S. Preventive Services Task Force – “you will only make the overdiagnosis problem worse.”

Definition, please. “Overdiagnosis” is the detection of disease that would NEVER cause symptoms or death during a patient’s expected lifetime. Aye, there’s the rub – expected lifetime. This is where strange brews and concoctions are used to convert indolent cancers into pseudocancers.

In the epidemiology of cancer screening, overdiagnosis is a given. It’s only a matter of quantifying the degree. And if your entire career is devoted to the anti-screening position, such quantification can run amuck. And one of the quickest ways for the overdiagnosis rate to escalate is by patient deaths prior to expectations. How much of overdiagnosis is related to biology and how much is due to premature death of the patient?
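The role of life expectancy in that definition can be made concrete with a small sketch. It assumes a hypothetical bookkeeping rule: a screen-detected cancer counts as overdiagnosed if the patient dies of something else before the tumor would have surfaced clinically. The class, field, and function names are invented for illustration and do not describe any registry or published method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScreenDetectedCase:
    age_at_detection: float
    # Years until the tumor would have caused symptoms without screening.
    # None means it would never have surfaced (a true non-progressor).
    lead_time_years: Optional[float]
    age_at_other_cause_death: float


def classify(case: ScreenDetectedCase) -> str:
    """Label a case under the 'never symptomatic in her lifetime' definition."""
    if case.lead_time_years is None:
        return "overdiagnosed (biology)"
    age_at_clinical_presentation = case.age_at_detection + case.lead_time_years
    if case.age_at_other_cause_death < age_at_clinical_presentation:
        return "overdiagnosed (competing death)"
    return "not overdiagnosed"


# The same hypothetical tumor, two different fates for the patient.
bus = ScreenDetectedCase(47.0, lead_time_years=2.0, age_at_other_cause_death=47.2)
long_life = ScreenDetectedCase(47.0, lead_time_years=2.0, age_at_other_cause_death=85.0)
print(classify(bus))
print(classify(long_life))
```

Nothing about the tumor differs between the two cases. Only the date of an unrelated death moves the patient from one column to the other, which is the question being asked above.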

This entire debate, however, would be moot if it weren’t for the evil cousin of overdiagnosis, that is, overtreatment. Who cares what you name it? We are overtreating too many patients.

That’s a legitimate point, and finally, after many years of neglect, we are seeing a huge push in the direction of limiting overtreatment. But the reason I am often railing against the Overdiagnosis Club is the fact that overdiagnosis (with presumed overtreatment) is being used to denigrate screening – not simply mammographic screening with all its known warts, but the technologies that allow us to find the cancers currently missed by mammography – ultrasound, PET, MBI, contrast-enhanced mammography and MRI. All of them are under attack for their potential for making the “overdiagnosis crisis even worse than it is already.”

The focus on overdiagnosis of invasive breast cancer has risen among public health guardians from an obscure sect 10 years ago into the dominant religion of the day for many. Belief in the unseen. Faith in the miraculous regression of invasive cancers (70,000 times a year) if we would only leave them alone and keep them out of the hands of breast radiologists and their clinical parasites.

But it is not a religion of passivity. The Overdiagnosis Cult (my favorite moniker) attacks from all angles. If you say that they have failed to correct for lead time at the end of the randomized screening trials (which tends to make overdiagnosis disappear), they will point out that this criticism is based on the false assumption that all tumors progress. If you say they are not accounting for patient deaths before the natural history plays out, they will show you the overdiagnosis rate they have calculated for women in their 40s who are screened and have plenty of time left for the natural history to emerge.

But there’s an odd paradox to their impressive arguments – the more that the Overdiagnosis Cult insists that these lesions NEVER progress, the more they are backed into the corner of having to explain why – unlike prostate cancer – we can’t find the direct evidence for overdiagnosis in breast cancer. Autopsy data doesn’t support it. And now, the regression study by Dr. Sickles’ group doesn’t support it. A true believer, however, is never backed into a corner – Dr. Welch begins his presentation by explaining there is no possible way to generate direct evidence for overdiagnosis since these tumors are removed after discovery. Thus, like “black holes” in science, we must use indirect methodology.

As we’ve seen in Part 1 and Part 2, however, this claim of “indirect only” is not entirely true. We get to directly observe what would happen if overdiagnosis of invasive breast cancer were as common as Dr. Welch and others have claimed. We would see either 1) tumor quiescence, or 2) tumor regression.

Tumor quiescence has been knocked out of the saddle by Dr. Welch’s own autopsy data from 20 years ago when his target was merely DCIS (he showed invasive cancers present in autopsy series only 1.3% of the time, the same as disease prevalence in the living, effectively ruling out quiescence as an explanation). Then, with the recent study by the Society of Breast Imaging, we know that tumor regression doesn’t happen either.

Okay, how about a 0.7cm tubular cancer found with screening mammography in a 78-year-old with co-morbidities? From a practical standpoint, it doesn’t matter whether you call it never-progressing or slowly-progressing. She should have a wide excision and nothing else, perhaps endocrine therapy. If we did that as our routine, we would not be suffering the attacks of the Overdiagnosis Cult to the same degree (and the same can be said for other presentations as well).

In the case of this 78-year-old, we have life expectancy supervening. But what is the natural history of the small, pure tubular cancer? Can you remember the last time you treated a 3.0cm pure tubular cancer? Probably not. Why are tubular cancers always small? And, as a corollary, why are they usually mammographic discoveries? Certainly, overdiagnosis applies here, you say. Maybe so, but only when life expectancy is part of the definition. So, yes, this 78-year-old would likely have been fine without screening, but what if she lives another 20 years? If tubular cancers don’t progress and don’t regress, then they should be identified frequently at autopsy. But they are not.

As it turns out, historical literature is available, generated in the early days of mammographic screening when tubular cancers started to emerge on a regular basis. What pathologists found was a direct correlation between size and “purity.” That is, as tumor diameter increased, less and less of the tumor surface area had strictly tubular features. Only remnants of tubular persisted above 2.0cm. This was judged from many cases, simply by recording percentage tubular vs. percentage non-tubular (or “garden variety” IDC) plotted against size. The larger the tumor, the smaller the percentage that was tubular. I can only think of one explanation – tubular cancers de-differentiate as they grow. It may take a long time, but they are not stagnant and thus are not overdiagnosed in the strictest sense. They are progressing. The most indolent invasive cancer known to exist progresses very slowly, not merely by an increase in size, but by additional mutations that allow the cancer to become “more malignant” over time (that is, the basic scientists’ definition of “progression,” as in “initiation, promotion, progression,” the 3 steps of carcinogenesis).

The Overdiagnosis Cult are iconoclasts, and they revel in their position, authoring articles and books, generating headlines based on “man bites dog” while the breast radiologists of the world go about their business saving lives with mammographic screening while suffering the beatings inflicted by constant media attacks.

There has been an enormous shift these past 10 years toward anti-screening sentiment, linked to the false belief that systemic therapies have advanced to the point where we can de-escalate screening efforts. It is true that once a “cure” is established for breast cancer, we will no longer need to screen. But the call to cut back on screening is premature. Way too many women are still dying of breast cancer, and if you understand the unique power of the “reverse randomized trial” from Mass General (Webb ML, Cady B, Michaelson JS, et al. Cancer 2014; 120:2839-2846), you know that most of those deaths (71%) occur in unscreened women.

I am not in favor of the status quo. My position is exactly opposite the anti-screeners – that is, we should be doing more. More, that is, in properly selected patients. The sensitivity of mammography has been overstated for years. Now that we have multi-modality imaging studies, we know the truth. And the truth is that the mortality reductions we see in the historic clinical trials of screening mammography are relatively modest only because detectable cancers have gone undetected. Goldilocks biology doesn’t help if the cancer is hidden by mammographic density. We ought to be doing everything possible to find those cancers. The mortality reduction through early detection (not limited to mammography) is actually greater than we currently believe because we are using data from obsolete technology that probably missed half of what we could detect today. Yet, we are told to “back off” on screening?

Sadly, we have the technology to find nearly all breast cancers when still small and node-negative, including the aggressive ones. Biologic B cancers, of course, will still be in the mix, so the mortality can never be reduced to zero with early detection. But instead of capitalizing on this incredible technology, we are being told that we will do more harm than good.

If you don’t think this is true, then read the fine print in the 2015 report from the U.S. Preventive Services Task Force. For women with dense breasts, the Task Force has issued a Grade “I” (Insufficient Evidence) for adjunctive screening – that is, in their own words: “The USPSTF concludes that the current evidence is insufficient to assess the balance of benefits and harms of adjunctive screening for breast cancer using breast ultrasound, magnetic resonance imaging (MRI), tomosynthesis, or other modalities in women identified to have dense breasts on an otherwise negative screening mammogram.” The Task Force goes on to explain their concern about the biggest harm of them all – overdiagnosis.

Now, if you happen to have Level D breast density, then the sensitivity for cancer detection with 2-D digital mammography is somewhere around 30-40%. Then, if you rank order sensitivity with available technologies (US, PET, MBI, MRI, 3D tomosynthesis, Contrast-enhanced Mammography), then the Task Force recommendation – mammography – comes in dead last. As a result, the Task Force is exclusively endorsing the worst possible imaging modality for breast cancer detection in these women.

They are not ignorant of this fact, but they refuse to budge from their no-win situation of relying entirely on the results from the mortality reduction endpoint in prospective-randomized trials (which can’t possibly keep up with developing technology). Of course, there are no prospective trials with a mortality endpoint for these modalities, so for the slaves to RCTs who refuse to yield to the quiet voice of rational thought, they are stuck in the technology of the 1970s and 1980s.

The Task Force sounds semi-apologetic about their plight, noting wistfully that it does appear that the other modalities increase cancer detection rates through improved sensitivity, but then comes their own chilling retort as they hold themselves in check – unfortunately, overdiagnosis has a direct and unavoidable relationship to sensitivity. If you improve the sensitivity of cancer detection, then the overdiagnosis problem is made worse. Consider the twisted mindset here – the fear of overdiagnosis is greater than the benefit of early detection (this, in spite of the fact that we already know that lives are saved, that is, the benefit outweighs the risk).

The endpoint of this warped logic is what should be terrifying to everyone – we must stop overdiagnosis in its tracks, and if stopping screening entirely is the only way to do it, then so be it. And if you think I’m being my usual hyperbolic self – witness the Swiss Medical Board, who in 2014, issued that very decree, recommending the cessation of screening mammography in Switzerland, given that harms outweigh benefit. And the greatest of those harms is overdiagnosis.

And that is why I rail against the inflated impact of overdiagnosis in breast cancer screening. Are we really going to let women die of breast cancer in the name of ethereal pseudocancers that no one can see or feel or document? So those of us who use breast MRI for screening and see clear-cut tumor downstaging (and incredibly low mortality rates emerging) are supposed to throw our hands up in the air and say, “Well, we must consider the alternative explanation – even though we can’t prove it – our outcomes are good only because we are diagnosing pseudocancers. And we should accept this as truth over the other straightforward explanation that early detection saves lives, even in our population of high risk women who have watched their relatives die of the disease, which is, in fact, why they come to us for screening MRI in the first place.”

I’m not endorsing mass multi-modality screening. Currently, we use risk levels to identify candidates for MRI screening, and we use density levels to select for ultrasound screening. Sounds good, but cost-effectiveness is still an issue, even though this is what some would term “precision medicine.” Cancer Detection Rates (CDRs) with multimodality imaging have impressive numbers for the prevalence screen, but then over time, as one slips into routine incidence screens, the cancer detection rates fall to a level that is harder to justify.

Why did we jump to an MRI policy of annual or nothing? Shouldn’t we study biennial or triennial screening for different levels of risk? Shouldn’t we be looking at strategies for post-mammography testing that could select patients for US or MRI or whatever, based on blood testing? (then the auxiliary imaging is only done with a positive blood test). This could open up the possibility of MRI for all women, independent of risk levels, yet not “automatically” performing MRI on a routine basis. Potentially, a positive blood test would convert the situation from screening to diagnostic. Instead of cancer detection rates plummeting as one moves to incidence screens with multi-modality approaches, there would be no scheduled, routine adjunct screen – and CDRs would remain high, if performing multimodality imaging only when the post-mammography test were positive.
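The arithmetic behind that proposal can be sketched as follows. Every number here is a placeholder, and the blood test is hypothetical: its sensitivity and specificity are parameters of the sketch, not properties of any existing assay. The function name is likewise invented.

```python
def cdr_per_1000_exams(prevalence, imaging_sensitivity,
                       triage_sensitivity=None, triage_specificity=None):
    """Cancers found per 1,000 adjunct imaging exams actually performed.

    prevalence: share of mammogram-negative women harboring a detectable cancer.
    Without a triage test, every woman is imaged. With one, only
    test-positive women go on to imaging.
    """
    if triage_sensitivity is None or triage_specificity is None:
        share_imaged = 1.0
        cancers_among_imaged = prevalence
    else:
        true_positives = prevalence * triage_sensitivity
        false_positives = (1 - prevalence) * (1 - triage_specificity)
        share_imaged = true_positives + false_positives
        cancers_among_imaged = true_positives
    if share_imaged == 0:
        return 0.0
    return 1000 * cancers_among_imaged * imaging_sensitivity / share_imaged


# Placeholder inputs for a routine incidence round.
routine = cdr_per_1000_exams(prevalence=0.005, imaging_sensitivity=0.9)
triaged = cdr_per_1000_exams(prevalence=0.005, imaging_sensitivity=0.9,
                             triage_sensitivity=0.8, triage_specificity=0.9)
print(f"image everyone: {routine:.1f} per 1,000 exams")
print(f"image test-positives only: {triaged:.1f} per 1,000 exams")
```

The yield per exam rises because far fewer exams are performed, not because more cancers are found. The cost is that any cancer the blood test misses never reaches imaging, so the approach stands or falls on the sensitivity of that test.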

We have barely tested the power of early diagnosis of breast cancer. Screening mammography has only teased us with its potential, leaving many cancers behind for early detection. But we’ll never find out the truth, if the policy-makers and Overdiagnosis Cult have their way – placing the fear of overdiagnosis in greater priority than saving lives.