#365Papers - A concept I desperately need

At some point in my search for academic and computational biology blogs to read, I somehow started following an ecology blog (might be because someone said something smart on Twitter...). In any case, in a recent post, Brian McGill asked a question I've been struggling with for a long time - namely, how much time should I be spending on reading scientific literature?

The answer is unequivocally "More than I currently do," but it can be daunting at times. I rather like the idea of setting a reasonable goal for the year, and tracking it. A paper a day is probably overambitious... and I'm not sure if I should count the discussions on TWiX that I listen to via podcast, but those are minor details.

I've followed Michael Scroggie's idea of setting up an IFTTT recipe to log tweets with the #365papers hashtag. I doubt I'll get close to the 365 mark, but if I can get to even 200...

Might as well write it down for posterity - here's my goal:

  • 25-30 for Audiommunity
  • ~20-30 on computational biology or microbial ecology that I blog about
  • ~100 on computational biology or microbial ecology for my own education.

The Audiommunity goal is definitely reachable, and the 100 for myself is possible (though it will require more effort). I don't know if I can reach the blogging threshold, as I've been particularly bad about that in recent months (years), but having the goal should be good motivation.

An Electronic Notebook To Suit My Needs

The Issue

I've been searching for the right tool for a digital laboratory notebook for ages. I've always been bad at keeping paper notebooks. When you're repeating procedures with small tweaks, it doesn't make sense to refer to your recipe on page 15 in lab notebook 1. Lots of notes are written on paper towels or post-its, and taping them into a paper notebook is a hassle and not very permanent. Referring to procedures in papers is a pain - and finding anything can be a nightmare if you don't spend hours each week indexing and organizing.

This problem was exacerbated when I was a graduate student in a lab where all of the data I generated (qPCR results, western blot images from a digital device, fluorescence microscopy images) was digital, and it's completely untenable now that a great deal of the work I'm doing is writing/running computer code. But keeping track of what you're doing is essential in science, whether at the bench or on the computer, and it would be nice to have an integrated way of doing that for myself and for my lab mates.

Things I've Tried

Towards the end of my Ph.D, I started to play around with Evernote, but I already had so much crap accumulated that it wasn't worth trying to wrap a new system around it. I ended up just making a zip folder of all of my data folders (organized roughly by date). But when I joined the Dutton Lab for my Postdoc, I decided to try to go all-in, and it worked fairly well. 

Evernote is pretty simple to use, but powerful. One of my favorite features was that you could embed a file (like an excel spreadsheet) directly into a note, and then open/edit the file within the note without needing to export/edit/save/import. You can also take hand-written notes and take a picture of them and Evernote will do OCR and make your notes searchable. 

There are a few problems that made Evernote untenable going forward. First, I couldn't get buy-in from the rest of the lab. There's a bit of a learning curve, and others weren't willing to invest the time. When you need to collaborate on projects/data sets and others are making changes in other systems, you lose the benefit of keeping everything in one place. Second, the large files generated when doing genomics work (sometimes >1 GB genome files) don't work that well, and more importantly the files embedded in Evernote are not (easily) accessible from the command line or other programs. Finally, and this is related to problem 2 - there's not a good way to integrate with the workflow for coding (git etc).

File System/Google Drive/Google Docs:
The nice thing about just storing data on your hard drive and syncing with Google Drive is that it already comes pretty naturally to most people. You can buy all the space you need (and some universities have deals for unlimited storage), and collaboration is a breeze. For the lab notebook itself, you can use one long Doc that has an automatically-generated table of contents. Added bonus - I've recently gotten really into Paperpile as a reference/paper manager, and the integration is seamless.

This is basically what the Dutton lab is doing now, and it works great for typical wetlab stuff. Collaboration is easy (though dealing with people that have different philosophies around data management can be tricky), and it's pretty low-impact. The main issue, again, is integrating with code. You can't use git and an auto-syncing service like Google Drive at the same time on the same files, and I'm not sure how it would deal with things like Jupyter notebooks. It's strangely hard to get formatted code into a Google document. And since I'm using a text editor for a great deal of my work, it would be nice to be able to edit my lab notebook there as well.

Other stuff:
Other people have suggested some other tools that I've looked into, but none of them is quite right.

  • Microsoft OneNote - similar to Evernote, and suffers from the same deficiencies (I prefer EN)
  • Jupyter notebooks - Great for code, usable with git, not so good for dealing with other types of data files (images etc)
  • Benchling - Built for labs, good sequence viewing/editing when doing small scale stuff (plasmid design etc), not great with large datasets (you've got to pay extra for space), can't edit notebook with my own text editor
  • SciNote - this product isn't available yet (though will supposedly be open source), and seems cool, though it seems like it will suffer from the same problems as Benchling.

What I'd like

I'd like a digital lab notebook that can integrate with the tools I already use for code, like Atom/Jupyter. It should be easy to reference files on my hard drive, have explicit version control (like git), and share with colleagues. Unfortunately, I think some of these goals are incompatible. A hybrid of GitHub for code/Jupyter + a notebook in Google Drive with links is maybe the best I can hope for. Probably the solution is something that just knits these together a little better. GitHub and Google Drive have decent APIs, so it should be possible, I think - but I'm not sure how I would build it.
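To make the "knitting together" idea a little more concrete, here's a minimal sketch in Python of one small piece of it: a plain-text notebook entry that records the date, links to data files on disk, and the git commit of the code that was used. Everything here is an assumption for illustration - the function names (current_commit, format_entry, append_entry), the entry layout, and the example file paths are all made up, and it doesn't touch the GitHub or Google Drive APIs at all. The only external thing it relies on is the standard `git rev-parse HEAD` command.

```python
import datetime
import subprocess
from pathlib import Path


def current_commit(repo_dir="."):
    """Return the current git commit hash for repo_dir, or None if unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, the directory is missing, or it is not a repo
        return None
    return result.stdout.strip() or None


def format_entry(title, body, files=(), commit=None, when=None):
    """Build one plain-text notebook entry (the layout is a made-up convention)."""
    when = when or datetime.date.today()
    lines = [f"## {when.isoformat()} - {title}", "", body, ""]
    for f in files:
        # Store paths with forward slashes so entries look the same on any OS
        lines.append(f"- file: {Path(f).as_posix()}")
    if commit:
        lines.append(f"- code version: {commit}")
    return "\n".join(lines).rstrip() + "\n"


def append_entry(notebook_path, entry):
    """Append an entry to the notebook file, creating the file if needed."""
    with open(notebook_path, "a", encoding="utf-8") as handle:
        handle.write(entry + "\n")


if __name__ == "__main__":
    entry = format_entry(
        "qPCR rerun",                      # hypothetical experiment title
        "Repeated run 3 with new primers.",
        files=["data/qpcr/run3.csv"],      # hypothetical data file
        commit=current_commit("."),
    )
    append_entry("notebook.md", entry)     # hypothetical notebook file
```

Since the notebook is just a text file, it could be edited in Atom like anything else, and the commit hash gives each entry a pointer back to the exact version of the code without the notebook itself having to live inside the repository.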

Bioinformatics Blogging

One of the reasons that my blogging (particularly for SciAm) has dropped off so precipitously is that the sense of community I used to see in science blogging - that is, in the early days of ScienceBlogs - feels completely absent. Part of the issue is that the commenting system on SciAm stinks (apparently still after the recent revamp), but I think that I'm most to blame here. It takes a lot of effort to build an interactive community of readers, and there are people on the SciAm blogs that have done it very effectively. But I think I fell into the trap of thinking that a bigger audience would mean more comments (without additional work). 

I'm starting to think it might be the opposite - being a bit more niche, and also interacting with other bloggers in a niche topic is a better way to have a community. That's how science blogging started out, and that's what made it so great. I got too enamored with folks like Carl Zimmer and Ed Yong, who can communicate about all kinds of different science topics effectively, and I thought I could do it too. But even if I had the skill, I couldn't ever have the time to do it effectively. 

So I'm going to try to go back to basics. Over the last two years, I've been transitioning to computational biology. I've learned a lot on my own, narrowly searching for answers to my particular questions, but I need to start being exposed to what experts think is important. It turns out that there are a bunch of really excellent bioinformatics blogs out there, and I need to start reading them and interacting on them. That's another thing I forgot - you've got to participate in discussions in other forums to bring discussion to your own, and the drop in my blog writing mirrors (or perhaps trails) the drop in my blog reading. 

So for 2016, I'm purging my long-neglected RSS feeds (unread list = 999+) and Twitter list, and I'm going to try to start from scratch. Blogs I'm starting with (based largely on this reddit thread):

Living in an Ivory Basement
Bits of DNA
Bioinformatician at Large

Let's see how this goes...

EDIT 1/18/16 - I realized shortly after posting this that none of these blogs are by women. I asked Jonathan Eisen on Twitter, and he forwarded the question to his followers. Got lots of favorites - but only one suggestion (for scienceblogs.com/digitalbio/ )...


Wow... that semester went fast.

Note to self - teaching 3 courses (one of which is brand new and two of which you're revamping) is sort of hectic. I barely had time to catch my breath all semester, and am only now getting to a place where I feel like I can start to tick off non-urgent things on my to-do list. One of which is blogging (who's shocked? No one).

The bad news is, the paper that was 85% done in August is only like 90% done now at the end of the year - I barely touched it all semester. It didn't help that my PI moved to San Diego (or that research is not the job that I'm actually getting paid for). The good news is, we're working on it again and I think that actually for real maybe I will have a working draft before the end of January to send around to collaborators. 

New Year's Resolutions - School Edition

Image credit: Wikimedia Commons

I really like the idea of new year's resolutions, though like most people I rarely follow through for any length of time. But this means I'll take any opportunity to make them, not just new calendar years. My birthday (in April) is the start of a new year of life, and now that I'm married I can use my anniversary (November) as the start of a new year of marriage. But I also teach, so I've decided that I'll use the start of a new school year in the same fashion.

Classes at Harvard start on Wednesday, and I'm teaching 3 of them. One is a class on Virology at the Extension School (if you want to take it, you can still sign up!), and the other two are the same Master's classes I taught last year. The virology course should be interesting, as there's a distance option where students can watch the course live-streamed and take part in a chat room, or take the course asynchronously by watching the recorded lecture after the fact. It will be an experiment for me, but I'm looking forward to the challenge.

I'm trying something new in my Master's courses too - I'm going to use Microsoft OneNote course notebooks for student assignments. These notebooks allow me to share assignments and other course material, as well as see all student work in one place. I'm not sure how well it will work, but I thought I should give it a go - having students turn things in via Google Drive last year did not work as well as I had hoped.

So what are the resolutions? Same as always really - be more organized, get more stuff done ahead of time (don't procrastinate). I've already started going to the gym again, but that's not teaching related anyway (it helps that the Y is across the street from my apartment). Maybe this year I'll actually stick to it... better get back to preparing for my first class on Thursday!