Sunday, August 21, 2016

Response to New Atlantis piece about "Saving Science (from itself)"

So I came across this New Atlantis piece by Daniel Sarewitz, a long, rambling essay on how to save science. From what? From itself, apparently. And here's the solution:
To save the enterprise, scientists must come out of the lab and into the real world.
To expand upon this briefly, Sarewitz claims that many ills befall our current scientific enterprise, and that these ills all stem from this "lie" from Vannevar Bush:
Scientific progress on a broad front results from the free play of free intellects, working on subjects of their own choice, in the manner dictated by their curiosity for exploration of the unknown.
The argument is that scientists, left to our own devices and untethered to practical applications (and unaccountable to the public), will drift aimlessly and never produce anything of merit to society. Moreover, the science itself will suffer from the careerism of scientists when divorced from "reality". Finally, he advocates that scientists, in order to avoid these ills, should be brought into direct relationship with outside influences. He makes his case using a set of stories touching on virtually every aspect of science today, from the "reproducibility crisis" to careerism to poor quality clinical studies to complexity to big data to model organisms—indeed, it is hard to find an issue with science that he does not ascribe to the lack of scientists focusing on practical, technology-oriented research.

Here's my overall take. Yes, science has issues. Yes, there's plenty we can do to fix it. Yes, applied science is great. No, this article most definitely does not make a strong case for Sarewitz's prescription that we all do applied science while being held accountable to non-scientists.

Indeed, at a primary level, Sarewitz's essay suffers from the exact same problem that he says much modern science suffers from. At the heart of this is the distinction between science and "trans-science", the latter of which basically means "complex systems". Here's an example from the essay:
For Weinberg, who wanted to advance the case for civilian nuclear power, calculating the probability of a catastrophic nuclear reactor accident was a prime example of a trans-scientific problem. “Because the probability is so small, there is no practical possibility of determining this failure rate directly — i.e., by building, let us say, 1,000 reactors, operating them for 10,000 years and tabulating their operating histories.” Instead of science, we are left with a mélange of science, engineering, values, assumptions, and ideology. Thus, as Weinberg explains, trans-scientific debate “inevitably weaves back and forth across the boundary between what is and what is not known and knowable.” More than forty years — and three major reactor accidents — later, scientists and advocates, fully armed with data and research results, continue to debate the risks and promise of nuclear power.
I rather like this concept of trans-science, and there are many parts of science, especially biomedical science, in which belief and narrative play bigger roles than we would like. This is true, I think, in any study of complex systems—including, for example, the study of science itself! Sarewitz's essay is riddled with narratives and implicit beliefs overriding fact, connecting dots of his choosing to suit his particular thesis and ignoring evidence to the contrary.

Sarewitz supports his argument with the following:
  1. The model of support from the Department of Defense (DOD), which is strongly bound to outcomes, provides more tangible benefits.
  2. Cancer biology has largely failed to deliver cures for cancer.
  3. Patient advocates can play a role in pushing science forward by holding scientists accountable.
  4. A PhD student developed a low-cost diagnostic inspired by his experiences in the Peace Corps.
A full line-by-line rundown of the issues here would simply take more time than it's worth (indeed, I've already spent much more time on this than it's worth!), but in general, the major flaw in this piece is in attempting to draw clean narrative lines when the reality is a much more murky web of blind hope, false starts, and hard-won incremental truths. In particular, we as humans tend to ascribe progress to a few heroes in a three-act play, when the truth is that the groundwork of success is a rich network of connections with no end in sight. In fact, true successes are so rare and the network underlying them so complex that it's relatively easy to spin the reasons for their success in any way you want.

Let me give a few examples from the essay here. Given that I am most familiar with biomedical research (and that biomedical research seems to be Sarewitz's most prominent target), I'll stick with that.

First, Sarewitz spills much ink in extolling the virtues of the DOD results-based model. And sure, look, DOD clearly has an amazing track record of funding science projects that transform society—that much is not in dispute. (That is their explicit goal, and so it is perhaps unsurprising that they have many such prominent successes.) In the biomedical sciences, however, there is little evidence that the DOD style of research produces benefits. In the entire essay, there is exactly one example given, that of Herceptin:
DOD’s can-do approach, its enthusiasm about partnering with patient-advocates, and its dedication to solving the problem of breast cancer — rather than simply advancing our scientific understanding of the disease — won Visco over. And it didn’t take long for benefits to appear. During its first round of grantmaking in 1993–94, the program funded research on a new, biologically based targeted breast cancer therapy — a project that had already been turned down multiple times by NIH’s peer-review system because the conventional wisdom was that targeted therapies wouldn’t work. The DOD-funded studies led directly to the development of the drug Herceptin, one of the most important advances in breast cancer treatment in recent decades.
This is blatantly deceptive. I get that people love the "maverick", and the clear insinuation here is that DOD, together with patient advocates, played that role and upended all the status-quo eggheads at NIH to Get Real Results. Nice story, but false. A quick look at Genentech's Herceptin timeline shows that many of the key results were in place well before 1993—in fact, they started a clinical trial in 1992! Plus, look at the timeline more closely, and you will see many seminal, basic science discoveries that laid the groundwork for Herceptin's eventual development. Were any of these discoveries made with a mandate from above to "Cure breast cancer by 1997 or bust"?

Overall, though, it is true that cancer treatment has not made remotely the progress we had hoped for. Why? Perhaps somewhat because of lack of imagination, but I think it's also just a really hard problem. And I get that patient advocates are frustrated by the lack of progress. Sorry, but wishing for a cure isn't going to make it happen. In the end, progress in technical areas is going to require people with technical expertise. Sarewitz devotes much of his article to the efforts of Fran Visco, a lawyer who got breast cancer and became a patient advocate, demanding a seat at the table for granting decisions. Again, it makes a nice story for a lawyer with breast cancer to turn breast cancer research on its head. I ask: would she take legal advice from a cancer biologist? Probably not. Here's a passage about Visco:
It seemed to her that creativity was being stifled as researchers displayed “a lemming effect,” chasing abundant research dollars as they rushed from one hot but ultimately fruitless topic to another. “We got tired of seeing so many people build their careers around one gene or one protein,” she says. Visco has a scientist’s understanding of the extraordinary complexity of breast cancer and the difficulties of making progress toward a cure. But when it got to the point where NBCC had helped bring $2 billion to the DOD program, she started asking: “And what? And what is there to show? You want to do this science and what?”
“At some point,” Visco says, “you really have to save a life.”
There is some truth to the fact that scientists chase career, fame and fortune. So they are human, so what? Trust me, if I knew exactly how to cure cancer for real, I would definitely be doing it right now. It's not for a lack of desire. Sometimes that's just science—real, hard science. Money won't necessarily change that reality just because of the number of zeros behind the dollar sign.

Note: I have talked with patient advocates before, and many of them are incredibly smart and knowledgeable and can be invaluable in the search for cures. But I think it's a big and unfounded leap to say that they would know how best to steer the research enterprise.

Along those lines, I think it's unfair to judge the biomedical enterprise solely by cancer research. Cancer is in many ways an easy target: huge funding, limited (though non-negligible) practical impact, fair amount of low quality research (sorry, but it's true). But there are many examples of success in biomedical science as well, including cancer. Consider HIV, which has been transformed from a death sentence into a more manageable disease. Or Gleevec. Or whatever. Many of which had no DOD involvement. And most of which relied on decades of blue skies research in molecular biology. Sure, out-of-the-box ideas have trouble gaining traction—the reasons for that should be obvious to anyone. That said, even our current system tolerates these: now fashionable ideas like immunotherapy for cancer did manage to subsist for decades even when nobody was interested.

Oh, and to the point about the PhD student's low-cost diagnostic: I of course wish him luck, but if I had a dollar for every press release on a low-cost diagnostic developed in the lab, I'd have, well, a lot of dollars. :) And seriously, there's lots of research going on in this and related areas, and certainly not all of it is from DOD-style entities. Again, I would hardly take this anecdote as a rationale for structurally changing the entire biomedical enterprise.

Anyway, to sum up, my point is that a more fair reading of the situation makes it clear Sarewitz's arguments are essentially just opinion, with little if any concrete evidence to back up his assertions that curiosity-driven research is going to destroy science from within.

Epilogue:

OK, so having spent a couple hours writing this, I'm definitely wondering why I bothered spending the time. I think most scientists would already find most of Sarewitz's piece wrong for many of the same reasons I did, and I doubt I'll convince him or his editors of anything, given their responses to my tweets:


I'm not familiar with The New Atlantis, and I don't know if they are some sort of scientific Fox News equivalent or what. I definitely get the feeling that this is some sort of icky political agenda thing. Still, if anyone reads this, my hope is that it may play some role in helping those outside science realize that science is just as hard and messy as their lives and work are, but that we're working on it and trying the best we can. And most of us do so with integrity, humility, and with a real desire to advance humanity.

Update, 8/21/2016: Okay, now I'm feeling really stupid. The New Atlantis really is some sort of scientific Fox News: they're supported/published by the Ethics and Public Policy Center, which is clearly some conservative "think" tank. Sigh. Bait taken.

Monday, July 18, 2016

Honesty, integrity, academia, industry

[Note added 7/22/2016 below in response to comments]

Implicit in my last post about reputation in science was one major assumption: that honesty and integrity are important in academia. The reason I left this implicit is because it seems so utterly obvious to us in academia, given that the truth is in many ways our only real currency. In industry, there are many other forms of currency, including (but not limited to) actual currency. And thus, while we value truth first and foremost in academia, I think that in some areas of industry, even those perhaps closely related to academia, the truth is just one of many factors to weigh in their final analysis. This leads to what I consider to be some fairly disturbing decision making.

It’s sort of funny: many very talented scientists I know have left academia because they feel like in industry, you’re doing something that is real and that really matters, instead of just publishing obscure papers that nobody reads. And in the end, it's true: if you buy an iPhone, it either works or doesn’t work, and it’s not really a debatable point most of the time. And I think most CEOs of very successful companies (that actually make real things that work) are people with a lot of integrity. Indeed, one of the main questions in the Theranos story is how it could have gotten so far with a product that clearly had a lot of issues that they didn’t admit to. Is Theranos the rare anomaly? Or are there a lot more Elizabeth Holmes’s out there, flying under the radar with a lower profile? Based on what I’ve heard, I’m guessing it’s the latter, and the very notion that industry cares about the bottom line of what works or doesn’t has a lot of holes in it.

Take the example of a small startup company looking for venture capital funding. Do the venture capitalists necessarily care about the truth of the product the company is selling or the integrity of the person selling it? Coming from academia, I thought this would be of paramount importance. However, from what I’ve been hearing, it turns out I was completely wrong. Take one case I’ve heard of where (to paraphrase) someone I know was asked by venture capitalists at some big firm or another to comment on someone they were considering funding. My acquaintance then related some serious integrity issues about the candidate to the venture capitalists. To which the venture people said something like “We hear what you’re saying. Thing is, I gotta say, a lot of people we look at make up their degrees and stuff like that. We just don’t really care.” A lot of people make up their degrees, and we just don’t really care. A number of other people I know have told me versions of the same thing: they call the venture capitalists (or the venture capitalists even call them), they raise their concerns, and the venture people just don’t want to hear it.

Let’s logic this out a bit. The question is why venture capitalists don’t care whether the people they fund are liars. Let’s take as a given that the venture capitalists are not idiots. One possible reason that they may not care is that it’s not worth their time to find out whether someone has faked their credentials. Well, given that the funding is often in the millions and it probably takes an underling half a day with Google and a telephone to verify someone’s credentials, I think that’s unlikely to be the issue (plus, it seems that even when lies are brought to their attention, they just don’t care). So now we are left with venture capitalists knowingly funding unscrupulous people. From here, there are a few possibilities. One is that someone could be a fraud personally but still build a successful business in the long term. Loath as I am to admit it, this is entirely possible—I haven’t run a business, and as I pointed out in the last post, there are definitely people in science who are pretty widely acknowledged as doing shoddy work, and yet it doesn’t (always) seem to stick. Moreover, there was the former dean of college admissions at MIT, who appeared to be rather successful at her job until it came out (you can’t make this stuff up) that she faked her college degrees. I do think, however, that the probability of a fraudulent person doing something real and meaningful in the world is probably considerably less than the infamous 1 out of 10 ratio of success to failure that venture people always bandy about, or at least considerably less than someone who's not a Faker McFakerpants. Plus, as the MIT example shows, there’s always the risk that someone finds out about it, and it leads to a high-profile debacle. Imagine if Elizabeth Holmes said that she actually graduated from Stanford (instead of admitting to dropping out (worn as a badge of honor?)). Would there be any chance she would have taken her scam this far without someone blowing the whistle? Overall, I think there’s a substantial long term risk in funding liars and cheats (duh?).

Another possibility, though, is that venture capitalists will fund people who are liars and cheats because they don’t care about building a viable long term business. All they care about is pumping the business up and selling it off to the next bidder. Perhaps the venture capitalists will invest in a charming con-artist because someone not, ahem, constrained by the details of reality might be a really good salesman. I don’t know, but the cynic in me says that this may be the answer more often than not. One might say, well, whatever, who cares if some Silicon Valley billionaires lose a couple million dollars. Problem is, implicit in this possibility is that somebody is losing out, most likely some other investors along the way. Just as bad, rewarding cheaters erodes everyone’s sense of trust in the system. This is particularly aggravating in cases when the company is couched in moral or ethical terms—and in situations where patient health is involved, everything suddenly becomes that much more serious still.

Overall, one eye-opening aspect of all this for me as an academic is that while we value integrity, skepticism and evidence very highly, business values things like “passion” more than we do. I don’t know that an imposition of academic values would have necessarily caught something like Theranos earlier on, or all the other lesser known cases out there, but I would like to think that it would. Why are these values not universal, though? After all, our role in academia is that of evaluation, of setting a bar that employers value. In a way, our students aren’t really paying for an education per se—rather, they are paying for our evaluation, which is a credential that will get them a job; in a sense, it’s their future employers that are paying for the degree. Why doesn’t this work when someone fakes a degree? When someone fakes data?

Here’s a thought. One way to counter the strategy of funding fakers and frauds would be for us to make this information public. It would be very difficult, then, to pump up the value of the company with such a cloud hanging over it, and so I think this would be a very effective deterrent. The biggest problem with this plan is the law. Making such information public can lead to big defamation lawsuits directed at the university and perhaps the faculty personally, and I’ve heard of universities losing these lawsuits even if they have documented proof of the fraud. So naturally, universities generally advise faculty against any public declarations of this sort. I don’t know what to do about that. It seems that with the laws set up the way they are, this option is just not viable most of the time.

I think the only real hope is that venture capitalists eventually decide that integrity actually does matter for the bottom line. I certainly don't have any numbers on this, but I know of at least one venture capital firm that claims success rates of 4 in 10 by taking a long view and investing carefully in the success of the people and ventures they fund. I would assume that integrity would matter a lot in that process. And I really do believe that at the end of the day in industry, integrity and reality really do trump hype and salesmanship, just like in academia. I don’t know a lot of CEOs, but one of my heroes is Ron Cook, CEO of Biosearch Technologies, a great scientist, businessman, and a person of integrity. I think it’s not coincidental that Ron has a PhD. For real.

Update in response to comments, 7/22/2016:
Got a comment from Anonymous and Sri saying that this post is overblowing the issue and unfairly impugns the venture capital industry. I would agree that perhaps some elements of this post are a bit overblown, and I certainly have no idea what the extent of this particular issue (knowingly funding fakers) is. This situation probably doesn't come up in the majority of cases, and it may be relatively rare. All that said, I understand that my post is short on specifics and data when it comes to funding known fakers and looking the other way, but I think it will be impossible to get data on this for the very same reason: fear of defamation lawsuits. You just can't say anything specific without being targeted by a defamation suit that you will probably lose even if you have evidence of faking. So where are you going to get this data from?

And it's true that I personally don't have enough anecdotes to consider this data. But I can say that essentially every single person I've discussed this with tells me the same thing: even if you say something, the venture capitalists won't care. In at least some cases, they have specific personal examples.

Also, note that I am not directly calling out the integrity of the venture capitalists themselves, but rather just pointing out that personal integrity of who they fund is not necessarily as big a factor in their decision making as I would have thought. My point is not so much about the integrity of venture capitalists—I suspect they are just optimizing to their objective function, which is return on investment. I just think that it's shady at a societal level that the integrity of who they fund is apparently less important to them than we in academia would hope. Let me ask you this: in your department, would you hire someone on the faculty knowing that they had faked their degree? I'm guessing the answer is no, and for good reason. The question is why those same reasons don't matter when venture capitalists are deciding whom to fund.

Tuesday, June 28, 2016

Reproducibility, reputation and playing the long game in science

Every so often these days, something will come up again about how most research findings are false. Lots of ink has already been spilled on the topic, so I won’t dwell on the reproducibility issue too long, but the whole thing has gotten me thinking more and more about the meaning and consequences of scientific reputation.

Why reputation? Reputation and reproducibility are somewhat related but clearly distinct concepts. In my field (I guess?) of molecular biology, I think that reputation and reproducibility are particularly strongly correlated because the nature of the field is such that perceived reproducibility is heavily tied to the large number of judgement calls you have to make in the course of your research. As such, perhaps reputation has evolved as the best way to measure reproducibility in this area.

I think that this stands in stark contrast with the more common diagnosis one sees these days for the problem of irreproducibility, which is that it's all down to statistical innumeracy. Every so often, I’ll see tweets like this (names removed unless claimed by owner):

The implication here is that the problem with all this “cell” biology is that the Ns are so low as to render the results statistically meaningless. The implicit solution to the problem is then “Isn’t data cheap now? Just get more data! It’s all in the analysis, all we need to do is make that reproducible!” Well, if you think that github accounts, pre-registered studies and iPython notebooks will magically solve the reproducibility problem, think again. Better statistical and analysis management practices are of course good, but the excessive focus on these solutions to me ignores the bigger point, which is that, especially in molecular and cellular biology, good judgement about your data and experiments trumps all. (I do find it worrying that statistics has somehow evolved to the point of absolving us of responsibility for the scientific inferences we make ("But look at the p-value!"). I think this statistical primacy is perhaps part of a bigger—and in my opinion, ill-considered—attempt to systematize and industrialize scientific reasoning, but that’s another discussion.)

Here’s a good example from the (infamous?) study claiming to show that aspartame induces cancer. (I looked this over a while ago given my recently acquired Coke Zero habit. Don’t judge.) Here’s a table summarizing their results:

[Table: incidence of lymphomas and leukemias in rats across the range of aspartame doses]

The authors claim that this shows an effect of increased lymphomas and leukemias in the female rats through the entire dose range of aspartame. And while I haven’t done the stats myself, looking at the numbers, the claim seems statistically valid. But the whole thing really hinges on the one control datapoint for the female rats, which is (seemingly strangely) low compared to virtually everything else. If that number was, say, 17% instead of 8%, I’m guessing essentially all the statistical significance would go away. Is this junk science? Well, I think so, and the FDA agrees. But I would fully agree that this is a judgement call, and in a vacuum would require further study—in particular, to me, it looks like there is some overall increase in cancers in these rats at very high doses, and while it is not statistically significant in their particular statistical treatment, my feeling is that there is something there, although probably just a non-specific effect arising from the crazy high doses they used.
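
To make concrete just how much a comparison like this can hinge on that one control value, here's a toy calculation. The counts below are invented for illustration (I'm assuming groups of 100 rats, which is not the actual design of the study, and tumor_p_value is just a helper defined here); the point is only that swapping a control rate of 8% for one of 17% moves a comparison from "significant" to "nothing there".

from scipy.stats import fisher_exact

def tumor_p_value(control_rate, treated_rate, n_per_group=100):
    # Fisher's exact test on a 2x2 table of (tumor, no tumor) counts
    # in control vs. treated groups. Group size is a made-up stand-in.
    control_tumors = round(control_rate * n_per_group)
    treated_tumors = round(treated_rate * n_per_group)
    table = [[control_tumors, n_per_group - control_tumors],
             [treated_tumors, n_per_group - treated_tumors]]
    _, p = fisher_exact(table)
    return p

print(tumor_p_value(0.08, 0.20))  # low control rate: comfortably below 0.05
print(tumor_p_value(0.17, 0.20))  # control rate of 17%: significance evaporates

Obviously the real analysis in the paper is more involved (dose trends, multiple tumor types), but the sensitivity to that single control number is the same kind of judgement call.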

Hey, you might say, that’s not science! Discarding data points because they “seem off” and pulling out statistically weak “trends” for further analysis? Well, whatever, in my experience, that’s how a lot of real (and reproducible) science gets done.

Now, it would be perfectly reasonable of you to disagree with me. After all, in the absence of further data, my inklings are nothing more than an opinion. And in this case, at least we can argue about the data as it is presented. In most papers in molecular biology, you don’t even get to see the data from all experiments they didn’t report for whatever reason. The selective reporting of experiments sounds terrible, and is probably responsible for at least some amount of junky science, but here’s the thing: I think molecular biology would be uninterpretable without it. So many experiments fail or give weird results for so many different reasons, and reporting them all would leave an endless maze that would be impossible to navigate sensibly. (I think this is a consequence of studying complex systems with relatively imprecise—and largely uncalibrated—experimental tools.) Of course, such a system is ripe for abuse, because anyone can easily leave out a key control that doesn’t go their way under the guise of “the cells looked funny that day”, but then again, there are days where the cells really do look funny. So basically, in the end, you are stuck with trust: you have to trust that the person you’re listening to made the right decisions, that they checked all the boxes that you didn’t even know existed, and that they exhibited sound judgement. How do you know what work to follow up on? In a vacuum, hard to say, but that’s where reputation comes in. And when it comes to reputation, I think there’s value in playing the long game.

Reputation comes in a couple different forms. One is public reputation. This is the one you get from talks you give and the papers you publish, and it can suffer from hype and sloppiness. People do still read papers and listen to talks (well, at least sometimes), and eventually they will notice if you cut corners and oversell your claims. Not much to say about this except that one way to get a good public reputation is to, well, do good science! Another important thing is to just be honest. Own up to the limitations of your work; it’s pretty easy to sniff out someone who’s being disingenuous (as the lawyerly answers from Elizabeth Holmes have shown), and I’ve found that people will actually respect you more if you just straight up say what you really think. Plus, it makes people think you’re smart if you show you’ve already thought about all the various problems.

Far more murky is the large gray zone of private reputation, which encompasses all the trust in the work that you don’t see publicly. This is going out to dinner with a colleague and hearing “Oh yeah, so-and-so is really solid”… or “That person did the same experiment 40 times in grad school to get that one result” or “Oh yeah, well, I don’t believe a single word out of that person’s mouth.” All of which I have heard, and don’t let me forget my personal favorite “Mr. Artifact bogus BS guy”. Are these just meaningless rumors? Sometimes, but mostly not. What has been surprising to me is how much signal there is in this reputational gossip relative to noise—when I hear about someone with a shady reputation, I will often hear very similar things independently from multiple sources.

I think this is (rightly) because most scientists know that spreading science gossip about people is generally something to be done with great care (if at all). Nevertheless, I think it serves a very important purpose, because there’s a lot of reputational information that is just hard to share publicly. There are many reasons for this: the burden of proof for calling someone out publicly is very high, the potential for negative fallout is large, and you can easily develop your own now-very-public reputation for being a bitter, combative pain in the ass. A world in which all scientists called each other out publicly on everything would probably be non-functional.

Of course, this must all be balanced against the very significant negatives to scientific gossip. It is entirely possible that someone could be unfairly smeared in this way, although honestly, I’m not sure how many instances of this I’ve really seen. (I do know of one case in which one scientist supposedly started a whisper campaign against another scientist about their normalization method or something suitably petty, although I have to say the concerns seemed valid to me.)

So how much gossip should we spread? For me, that completely depends on the context. With close friends, well, that’s part of the fun! :) With other folks, I’m of course far more restrained, and I try to stick to what I know firsthand, although it’s impossible to give a straight up rule given the number of factors to weigh. Are they asking for an evaluation of a potential collaborator? Are we discussing a result that they are planning to follow up on in the lab, thus potentially harming a trainee? Will they even care what I say either way? An interesting special case is trainees in the lab. I think they actually stand to benefit greatly from this informal reputational chatter. Not only do they learn who to avoid, but even just knowing the fact that not everyone in science can be trusted is a valuable lesson.

Which leads to another important problem with private reputations: if they are private, what about all the other people who could benefit from that knowledge but don’t have access to it? This failure can manifest in a variety of ways. For people with less access to the scientific establishment (smaller or poorer countries, e.g.), you basically just have to take the literature at face value. The same can be true even within the scientific establishment; for example, in interdisciplinary work, you’ll often have one community that doesn’t know the gossip of another (lots of examples where I’ll meet someone who talks about a whole bogus subfield without realizing it’s bogus). And sometimes you just don’t get wind in time. The damage in terms of time wasted is real. I remember a time when our group was following up a cool-seeming result that ended up being bogus as far as we could tell, and I met a colleague at a conference, told her about it, and she said they saw the same thing. Now two people know, and perhaps the handful of other people that I’ve mentioned this to. That doesn’t seem right.

At this point, I often wonder about a related issue: do these private reputations even matter? I know plenty of scientists with widely-acknowledged bad reputations who are very successful. Why doesn’t it stick? Part of it is that our review systems for papers and grants just don’t accommodate this sort of information. How do you give a rational-sounding review that says “I just don’t believe this”? Some people do give those sorts of reviews, but come across as, again, bitter and combative, so most don’t. Not sure what to do about this problem. In the specific case of publishing papers, I often wonder why journal editors don’t get wind of these issues. Perhaps they’re just in the wrong circles? Or maybe there are unspoken union rules about ratting people out to editors? Or maybe it’s just really hard not to send a paper to review if it looks strong on the face of it, and at that point, it’s really hard for reviewers to do anything about it. Perhaps preprints and more public discussion could help with this? Of course, then people would actually have to read each other’s papers…

That said, while the downsides of a bad private reputation may not materialize as often as we feel they should, the good news is that I think the benefits to a good private reputation can be great. If people think you do good, solid work, I think that people will support you even if you’re not always publishing flashy papers and so forth. It’s a legitimate path to success in science, and don’t let the doom and gloomers and quit-lit types tell you otherwise. How to develop and maintain a good private reputation? Well, I think it’s largely the same as maintaining a good public one: do good science and don’t be a jerk. The main difference is that you have to do these things ALL THE TIME. There is no break. Your trainees and mentors will talk. Your colleagues will talk. It’s what you do on a daily basis that will ensure that they all have good things to say about you.

(Side point… I often hear that “Well, in industry, we are held to a different standard, we need things to actually work, unlike in academia.” Maybe. Another blog post on this soon, but I’m not convinced industry is any better than academia in this regard.)

Anyway, in the end, I think that molecular biology is the sort of field in which scientific reputation will remain an integral part of how we assess our science, for better or for worse. Perhaps we should develop a more public culture of calling people out like in physics, but I’m not sure that would necessarily work very well, and I think the hostile nature of discourse in that field contributes to a lack of diversity. Perhaps the ultimate analysis of whether to spread gossip or do something gossip-worthy is just based on what it takes for you to get a good night’s sleep.

Saturday, June 11, 2016

Some thoughts on lab communication

I recently came across this nice post about tough love in science:
https://ambikamath.wordpress.com/2016/05/16/on-tough-love-in-science/
and this passage at the start really stuck out:
My very first task in the lab as an undergrad was to pull layers of fungus off dozens of cups of tomato juice. My second task was PCR, at which I initially excelled. Cock-sure after a week of smaller samples, I remember confidently attempting an 80-reaction PCR, with no positive control. Every single reaction failed… 
I vividly recall a flash of disappointment across the face of one of my PIs, probably mourning all that wasted Taq. That combination—“this happens to all of us, but it really would be best if it didn’t happen again”—was exactly what I needed to keep going and to be more careful.
Now, communication is easy when it's all like "Hey, I've got this awesome idea, what do you think?" "Oh yeah, that's the best idea ever!" "Boo-yah!" [secret handshake followed by football head-butt]. What I love about this quote is how it perfectly highlights how good communication can inspire and reassure, even in a tough situation—and how bad communication can lead to humiliation and disengagement.

I'm sure there are lots of theories and data out there about communication (or not :)), but when it comes down to putting things into practice, I've found that simple rules or principles are often a lot easier to follow and to quantify. One that has been particularly effective for me is to avoid "you" language, which is the ultimate simple rule: just avoid saying "you"! Now that I've been following that rule for some time and thinking about why it's so effective at improving communication, I think there's a relatively simple principle beneath it that is helpful as well: if you're saying something for someone else's benefit, then good. If you're saying something for your own benefit, then bad. Do more of the former, less of the latter.

How does this work out in practice? Let's take the example from the quote above. As a (disappointed) human being, your instinct is going to be to think "Oh man, how could you have done that!?" A simple application of no-you-language will help you avoid saying this obviously bad thing. But there are counterproductive no-you-language ways to respond as well: "Well, that was disappointing!" "That was a big waste" "I would really double check things before doing that again". Perhaps the first two of these are straightforwardly incorrect, but I think the last one is counterproductive as well. Let's dissect the real reasons you would say "I would really double check before doing that again". Now, of course the trainee is going to be feeling pretty awful—people generally know when they've screwed up, especially if they screwed up bad. Anyone with a brain knows that if you screw up big, you should probably double check and be more careful. So what's the real reasoning behind telling someone to double check? It's basically to say "I noticed you screwed up and you should be more careful." Ah, the hidden you language revealed! What this sentence is really about is giving yourself the opportunity to vent your frustration with the situation.

So what to say? I think the answer is to take a step back, think about the science and the person, and come up with something that is beneficial to the trainee. If they're new, maybe "Running a positive control every time is really a good idea." (unless they already realized that mistake).  Or "Whenever I scale up the reaction, I always check…" These bits of advice often work well when coupled with a personal story, like "I remember when I screwed up one of these big ones early on, and what I found helped me was…". I will sometimes use another mythic figure from the lab's recent past, since I'm old enough now that personal lab stories sound a little too "crazy old grandpa" to be very effective…

It is also possible that there is nothing to learn from this mistake and that it was just, well, a mistake. In which case, there is nothing you can say that is for anyone's benefit other than your own, and in those situations, it really is just better to say nothing. This can take a lot of discipline, because it's hard not to express those sorts of feelings right when they're hitting you the hardest. But it's worth it. If it's a repeated issue that's really affecting things, there are two options: 1. address it later during a performance review, or 2. don't. Often, with those sorts of issues, there's honestly not much difference in outcome between these options, so maybe it's just better to go with 2.

Another common category of negative communication is all the sundry versions of "I told you so". This is obviously something you say for your own benefit rather than the other person's, and indeed it is so clearly accusatory that most folks know not to say this specific phrase. But I think this is just one of a class of what I call "scorekeeping" statements, which are ones that serve only to remind people of who was right or wrong. Like "But I thought we agreed to…" or "Last time I was supposed to…" They're very tempting, because as scientists we are in the business of telling each other when we're right and wrong, but when you're working with someone in the lab, scoring these types of points is corrosive in the long term. Just remember that the next time your PI asks you to change the figure back the other way around for the 4th time… :)

Along those lines, I think it's really important for trainees (not just PIs) to think about how to improve their communication skills as well. One thing I hear often is "Before I was a PI, I got all this training in science, and now I'm suddenly supposed to do all this stuff I wasn't trained for, like managing people". I actually disagree with this. To me, the concept of "managing people" is sort of a misnomer, because in the ideal case, you're not really "managing" anyone at all, but rather working with them as equals. That implies an equal stake in and commitment to productive communications on both ends, which also means that there are opportunities to learn and improve for all parties. I urge trainees to take advantage of those opportunities. Few of us are born with perfect interpersonal skills, especially in work situations, and extra especially in science, where things change and go wrong all the time, practically begging people to assign blame to each other. It's a lot of work, but a little practice and discipline in this area can go a long way.

Wednesday, June 8, 2016

What’s so bad about teeny tiny p-values?

Every so often, I’ll see someone make fun of a really small p-value, usually along with some line like “If your p-value is smaller than 1/(number of molecules in the universe), you must be doing something wrong”. At first, this sounds like a good burn, but thinking about it a bit more, I just don’t get this criticism.

First, the number itself. Is it somehow because the number of molecules in the universe is so large? Perhaps this conjures up some image of “well, this result is saying this effect could never happen anywhere ever in the whole universe by chance—that seems crazy!”, and makes it seem like there’s some flaw in the computation or deduction. Pretty easy to spot the flaw in that logic: configurational space can of course be much larger than the raw number of constituent parts. For example, let’s say I mix some red dye into a cup of water and then pour half of the dyed water into another cup. Now there is some probability that, randomly, all the red dye stays in one cup and no dye goes in the other. That probability is 1/(2^numberOfDyeMolecules), which is clearly going to be a pretty teeny-tiny number.

Here’s another example that may hit a bit closer to home: during cell division, the nuclear envelope breaks down, and so many nuclear molecules (say, lincRNA) must get scattered throughout the cell (and yes, we have observed this to be the case for e.g. Xist and a few others). Then, once the nucleus reforms, those lincRNA seem to be right back in the nucleus. What is the probability that the lincRNA just happened to end up back in the nucleus by chance? Well, again, 1/(2^numberOfRNAMolecules) (assuming a 50/50 nucleus/cytoplasm split), which for many lincRNA is probably like 1/1024 or so, but for something like MALAT1, would be 1/(2^2000) or so. I think we can pretty safely reject the hypothesis that there is no active trafficking of MALAT1 back into the nucleus… :)
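
If you want to check the arithmetic on those back-of-the-envelope numbers (the dye example works the same way), here's a tiny helper that works in log10 so the huge exponents don't underflow. The 50/50 nucleus/cytoplasm split per molecule is the same simplifying assumption as above.

from math import log10

def log10_p_all_in_nucleus(n_molecules):
    # log10 of P(every one of n molecules lands in the nucleus by chance)
    # = log10((1/2)^n) = -n * log10(2)
    return -n_molecules * log10(2)

print(log10_p_all_in_nucleus(10))    # about -3, i.e. the ~1/1024 quoted above
print(log10_p_all_in_nucleus(2000))  # about -602: MALAT1-scale copy numbers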

I think the more substantial concern people raise with these p-values is that when you get something so small, it probably means that you’re not taking into account some sort of systematic error; in other words, the null model isn’t right. For instance, let’s say I measured a slight difference in the concentration of dye molecules in the second cup above. Even a pretty small change will have an infinitesimal p-value, but the most likely scenario is that some systematic error is responsible (like dye getting absorbed by the material on the second glass or the glasses having slightly different transparencies or whatever). In genomics—or basically any study where you are doing a lot of comparisons—the same sort of thing can happen if the null/background model is slightly off for each of a large number of comparisons.
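
Here's a quick simulation of that last point, with numbers invented purely for illustration: a tiny systematic offset, multiplied across a large number of measurements, produces an extremely small p-value against the "no difference" null even though nothing scientifically interesting is going on.

import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(0)

# 100,000 measurements with a small systematic offset (3% of a standard
# deviation), standing in for, say, dye sticking to one of the glasses.
data = rng.normal(loc=0.03, scale=1.0, size=100_000)

# Test against a null of mean exactly zero, i.e. a null model that ignores
# the systematic error entirely.
_, p = ttest_1samp(data, popmean=0.0)
print(p)  # an absurdly small p-value for a scientifically negligible shift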

All that said, I still don’t see why people make fun of small p-values. If you have a really strong effect, then it’s entirely possible that you can get such a tiny p-value. In which case, the response is typically “well, if it’s that obvious, then why do any statistics?” Okay, fine, I’m totally down with that! But then we’re basically saying that there are no really strong effects out there: if you’re doing enough comparisons that you might get one of these tiny p-values, then any strong, real effect must generate one of these p-values, no? In fact, if you don’t get a tiny p-value for one of these multi-factorial comparisons, then you must be looking at something that is only a minor effect at best, like something that only explains a small amount of the variance. Whether that matters or not is a scientific question, not a statistical one, but one thing I can say is that I don’t know many examples (at least in our neck of the molecular biology woods) in which something that was statistically significant but explained only a small amount of the variance was really scientifically meaningful. Perhaps GWAS is a counterexample to my point? Dunno. Regardless, I just don’t see much justification in mocking the teeny tiny p-value.

Oh, and here’s a teeny tiny p-value from our work. Came from comparing some bars in a graph that were so obviously different that only a reviewer would have the good sense to ask for the p-value… ;)


Update, 6/9/2016:
There are a couple of examples that I think illustrate some of these points better. First, not all tiny p-values are necessarily the result of obvious differences or faulty null models in large numbers of comparisons. Take Uri Alon's network motifs work. Following the treatment in his book, he showed that in the transcriptional network of E. coli (424 nodes, 519 edges), there were 40 examples of autoregulation. Is this higher, lower or equal to what you would expect? Well, maybe you have a good intuitive handle on random network theory, but for me, the fact that this is very far from the null expectation of around 1.2±1.1 autoregulatory motifs (p-value something like 10^-30) is not immediately obvious. One can (and people do) quibble about the particular type of random network model, but in the end, the p-values were always teeny tiny and I don't think that is either obvious or unimportant.
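
For the curious, that null expectation is easy to reproduce at least roughly: in a random directed network with the same number of nodes and edges, each edge lands on a self-loop with probability about 1/n, so the count of autoregulatory loops is approximately Poisson. This is a sketch of that back-of-the-envelope version, not Alon's actual randomization scheme.

from math import sqrt
from scipy.stats import poisson

n_nodes, n_edges = 424, 519
expected = n_edges / n_nodes   # each edge is a self-loop with probability ~1/n
std = sqrt(expected)           # roughly Poisson, so std = sqrt(mean)
print(expected, std)           # ~1.2 ± 1.1, as quoted above

# Probability of seeing 40 or more autoregulatory loops under this null:
print(poisson.sf(39, expected))  # a vanishingly small number

One can make the null fancier (preserving degree sequences and so on), but as noted above the answer stays teeny tiny.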

Second, the fact that a large number of small effects can give a tiny p-value doesn't automatically discount their significance. My impression from genetics is that many phenotypes are composed of large numbers of small effects. Moreover, a perturbation such as a gene knockout can lead to a large number of small effects. Whether those effects are meaningful is a scientific question (and an open one, to my mind), but whether the p-value is small or not is not really relevant.

This is not to say that all tiny p-values mean there's some science worth looking into. Some of the worst offenders are the examples of "binning", where, e.g. half-life of individual genes correlates with some DNA sequence element, R^2=0.21, p=10^-17 (totally made up example, no offense to anyone in this field!). No strong rule comes from this, so who knows if we actually learned something. I suppose an argument can be made either way, but the bottom line is that those are scientific questions, and the size of the p-value is irrelevant. If the p-value were bigger, would that really change anything?
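
Just to illustrate the binning-style situation with synthetic data (the numbers here are made up, like the example in the text): a correlation explaining roughly 20% of the variance across a few thousand genes comes with a vanishingly small p-value, but the p-value itself tells you nothing about whether the relationship is scientifically interesting.

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n_genes = 5000

sequence_score = rng.normal(size=n_genes)
# Half-life weakly driven by the sequence score, mostly noise (R^2 around 0.2).
half_life = 0.52 * sequence_score + rng.normal(size=n_genes)

r, p = pearsonr(sequence_score, half_life)
print(r**2, p)  # modest R^2, teeny tiny p-value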

Sunday, May 22, 2016

Spring cleaning, old notebooks, and a little linear algebra problem

Update 5/25/2016: Solution at the bottom

These days, I spend most of my time thinking about microscopes and gene regulation and so forth, which made it all the more of a surprising coincidence that, on the eve of what looks to be a great math-bio symposium here at Penn tomorrow, I was doing some spring cleaning in the attic and happened across a bunch of old notebooks from my undergraduate and graduate school days in math and physics (and a bunch of random personal stuff that I'll save for another day—which is to say, never). I was fully planning to throw all those notebooks away, since of course the last time I really looked at them was probably well over 10 years ago, and I did indeed throw away a couple from some of my less memorable classes. But I was surprised to find that I actually wanted to hold on to most of them.

Why? I think partly that they serve as an (admittedly faint) reminder that I used to actually know how to do some math. It's actually pretty remarkable to me how much we all learn during our time in formal class training, and it is sort of sad how much we forget. I wonder to what degree it's all in there somewhere, and how long it would take me to get back up to speed if necessary. I may never know, but I can say that all that background has definitely shaped me and the way that I approach problems, and I think that's largely been for the best. I often joke in lab about how classes are a waste of time, but it's clear from looking these over that that's definitely not the case.

I also happened across a couple of notebooks that brought back some fond memories. One was Math 250(A?) at Berkeley, then taught by Robin Hartshorne. Now, Hartshorne was a genius. That much was clear on day one, when he looked around the room and precisely counted the number of students in the room (which was around 40 or so) in approximately 0.58 seconds. All the students looked at each other, wondering whether this was such a good idea after all. Those who stuck with it got exceptionally clear lectures on group theory, along with by far the hardest problem sets of any class I've taken (except for a differential geometry class I dropped, but that's another story). Of the ten problems assigned every week, I could do maybe one or two, after which I puzzled away, mostly in complete futility, until I went to his very well attended office hours, at which he would give hints to help solve the problems. I can't remember most of the details, but I remember that one of the hints was so incredibly arcane that I couldn't imagine how anyone, ever, could have come up with the answer. I think that Hartshorne knew just how hard all this was, because one time I came to his office hours after a midterm when a bunch of people were going over a particular problem, and I said "Oh yeah, I think I got that one!" and he looked at me with genuine incredulity, at which point I explained my solution. Hartshorne looked relieved, pointed out the flaw, and all went back to normal in the universe. :) Of course, there were a couple kids in that class from whom Hartshorne wouldn't have been surprised to see a solution, but that wasn't me, for sure.

While rummaging around in that box of old notebooks, I also found some old lecture notes that I really wanted to keep. Many of these are from one of my PhD advisors, Charlie Peskin, who had some wonderful notes on mathematical physiology, neuroscience, and probability. His ability to explain ideas to students with widely-varying backgrounds was truly incredible, and his notes are so clear and fresh. I also kept notes from a couple of my other undergrad classes that I really loved, notably Dan Roksar's quantum mechanics series, Hirosi Ooguri's statistical mechanics and thermodynamics, and Leo Harrington's set theory class (which was truly mind-bending).

It was also fun to look through a few of the problem sets and midterms that I had taken—particularly odd now to look at some old dusty blue books and imagine how much stress they had caused at the time. I don't remember many of the details, but I somehow still vaguely remembered two problems, one in undergrad, one in grad school as being particularly interesting. The undergrad one was some sort of superconducting sphere problem in my electricity and magnetism course that I can't fully recall, but it had something to do with spherical harmonics. It was a fun problem.

The other was from a homework in a linear algebra class I took in grad school from Sylvia Serfaty, and I did manage to find it hiding in the back of one of my notebooks. A simple-seeming problem: given an n×n matrix A, formulate necessary and sufficient conditions for the 2n×2n matrix B defined as

B = |A A|
    |0 A|

to be diagonalizable. I'll give you a hint that is perhaps what one might guess from the n=1 case: the condition is that A = 0. In that case, sufficiency is trivial (B = 0 is definitely diagonalizable), but showing necessity—i.e., showing that if B is diagonalizable, then A = 0—is not quite so straightforward. Or, well, there's a tricky way to get it, at least. Free beer to whoever figures it out first with a solution as tricky (or more tricky) than the one I'm thinking of! Will post an answer in a couple days.
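
If you'd rather poke at it numerically before hunting for the trick, here's a quick sanity check using sympy's exact diagonalizability test (make_B is just a helper defined here for the check, not part of the original problem):

import sympy as sp

def make_B(A):
    # Build the block matrix B = [[A, A], [0, A]] from an n x n matrix A.
    n = A.shape[0]
    top = A.row_join(A)
    bottom = sp.zeros(n, n).row_join(A)
    return top.col_join(bottom)

# n = 1, A nonzero: B is a 2x2 Jordan block, hence not diagonalizable.
print(make_B(sp.Matrix([[1]])).is_diagonalizable())          # False

# A generic nonzero 2x2 example behaves the same way.
print(make_B(sp.Matrix([[1, 2], [3, 4]])).is_diagonalizable())  # False

# A = 0: B is the zero matrix, which is (trivially) diagonalizable.
print(make_B(sp.zeros(2, 2)).is_diagonalizable())            # True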

Update, 5/25/2016: Here's the solution!
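
A sketch of one way to see it (there may well be a slicker trick). First check, by induction on k, that

B^k = |A^k  k A^k|
      | 0    A^k |

so that for any polynomial p,

p(B) = |p(A)  A p'(A)|
       |  0    p(A)  |

Now suppose B is diagonalizable, and let m be its minimal polynomial. Diagonalizability (over the complex numbers, say) means m has distinct roots, so m and its derivative m' share no common roots and gcd(m, m') = 1. Plugging B into m gives m(A) = 0 and A m'(A) = 0. Since gcd(m, m') = 1, there are polynomials u and v with u m + v m' = 1; evaluating at A and using m(A) = 0 shows v(A) m'(A) = I, so m'(A) is invertible. But then A m'(A) = 0 forces A = 0, which is exactly the necessity direction.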


Sunday, May 1, 2016

The long tail of artificial narrow superintelligence

As readers of the blog have probably guessed, there is a distinct strain of futurism in the lab, mostly led by Paul, Ally, Ian, and me (everyone else mostly just rolls their eyes, but what do they know?). So it was against this backdrop that we had a heated discussion recently about the implications of AlphaGo.

It started with a discussion I had with someone who is an expert on machine learning and knows a bit of Go, and he said that AlphaGo was a huge PR stunt. He said this based on the fact that the way AlphaGo wins is basically by using deep learning to evaluate board positions really well, while doing a huge number of calculations to determine which play leads to the best position. Is that really “thinking”? Here, opinions were split. Ally was strongly in the camp of this being thinking, and I think her argument was pretty valid. After all, how different is that necessarily from how humans play? They probably think up possible places to go and then evaluate the board position. I was of the opinion that this is a different type of thinking than human thinking entirely.
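
As a cartoon of that "learned evaluation plus lots of calculation" split, the move selection boils down to something like the sketch below. Everything here is a hypothetical stand-in (value_network, legal_moves, and apply_move are not real APIs), and the real system uses Monte Carlo tree search, a policy network, and vastly deeper lookahead; the point is just that a neural network scores positions while brute-force enumeration decides which positions get scored.

def best_move(board, value_network, legal_moves, apply_move):
    # One-ply version of "search + learned evaluation": score the position
    # reached by every legal move with the neural network and pick the best.
    # Real systems explore enormous trees of continuations instead of one ply.
    return max(legal_moves(board),
               key=lambda move: value_network(apply_move(board, move)))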

Thinking about it some more, I think perhaps we’re both right. Using neural networks to read the board is indeed amazing, and a feat that most thought would not be possible for a while. It’s also clear that AlphaGo is doing a huge number of more “traditional” brute force computations of potential moves than Lee Sedol was. The question then becomes how close the neural network part of AlphaGo is compared to Lee Sedol’s intuition, given that the brute force logic parts are probably tipped far in AlphaGo’s favor. This is sort of a hard question to answer, because it’s unclear how closely matched they were. I was, perhaps like many, sort of shocked that Lee Sedol managed to win game 4. Was that a sign that they were not so far apart from each other? Or just a weird flukey sucker punch from Sedol? Hard to say. I think the fact that AlphaGo was probably no match for Sedol a few months prior is probably a strong indication that AlphaGo is not radically stronger than Sedol. So my feeling is that Sedol’s intuition is still perhaps greater than AlphaGo’s, which allowed him to keep up despite such a huge disadvantage in traditional computation power.

Either way, given the trajectory, I’m guessing that within a few months, AlphaGo will be so far superior that no human will ever, ever be able to beat it. Maybe this is through improvements to the neural network or to traditional computation, but whatever the case, it will not be thinking the same way as humans. The point is that it doesn’t matter, as far as playing Go is concerned. We will have (already have?) created the strongest Go player ever.

And I think this is just the beginning. A lot of the discourse around artificial intelligence revolves around the potential for artificial general super-intelligence (like us, but smarter), like a paper-clip making app that will turn the universe into a gigantic stack of paper-clips. I think we will get there, but well before then, I wonder if we’ll be surrounded by so much narrow-sense artificial super-intelligence (like us, but smarter at one particular thing) that life as we know it will be completely altered.

Imagine a world in which there is super-human level performance at various “brain” tasks. What will be the remaining motivation to do those things? Will everything just be a sport or leisure activity (like running for fun)? Right now, we distinguish (perhaps artificially) between what’s deemed “important” and what’s just a game. But what if we had a computer for proving math theorems or coming up with algorithms, one vastly better than any human? Could you still have a career as a mathematician? Or would it just be one big math olympiad that we do for fun? I’m now thinking that it’s possible that virtually everything humans think is important and do for “work” could be overtaken by “dumb” artificial narrow super-intelligence, well before the arrival of a conscious general super-intelligence. Hmm.

Anyway, for now, back in our neck of the woods, we've still got a ways to go in getting image segmentation to perform as well as humans. But we’re getting closer! After that, I guess we'll just do segmentation for fun, right? :)