Unlocking DRM Lets You Open Multiple eBooks Simultaneously

The Amazon Kindle, Apple iPad and other e-readers are fast becoming mainstream, and their usability has improved tremendously over the past few years. However, there is one area in which printed books are still much better: the ability to open multiple books at once. This might not matter if you are reading the latest “50 shades” novel and want to be uninterrupted. However, if you are working on a research project and constantly need to switch between multiple books, you will find that current eBook readers are a nightmare. Switching eBooks involves creating bookmarks, returning to a main menu (library page), going to another book and navigating it. This quickly becomes tedious. I cannot understand why tabbed browsing is absent from eBook software, since it is a rudimentary feature that exists in practically every web browser.

One solution is to buy multiple eBook readers and open one book per device. This turns out to work quite well. One might argue that the savings from not having to ship printed books will more than cover the cost of additional eBook readers. However, it occurred to me recently that another solution exists: simply remove the DRM from your existing books. This is really easy to do. You can then manage your books using software like calibre, which allows multiple eBooks to be opened at the same time. On a fast computer with a large screen, this is a liberating experience! A 27″ or 30″ screen is sufficient to give me as good an experience as having 3-4 printed books open. You can even do things that you cannot do with regular books (without mutilating them), such as opening multiple instances of the same book for quick cross-referencing across different sections. If you take the extra step and export your library into PDF format, you then have the ability to manage, annotate and search your eBooks using software like Papers 2, treating them just like any other PDF file and merging them with your collection of journal articles.
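The PDF export step can be scripted. Below is a minimal sketch in Python, assuming calibre is installed, its `ebook-convert` command-line tool is on your PATH, and the books are already DRM-free. The function names, file names and folder names are hypothetical, chosen only for illustration.

```python
import subprocess
from pathlib import Path


def build_commands(book_paths, out_dir):
    """Return one ebook-convert command (as an argument list) per book.

    Each book is written to out_dir under the same file name with a .pdf extension.
    """
    out_dir = Path(out_dir)
    return [
        ["ebook-convert", str(p), str(out_dir / (Path(p).stem + ".pdf"))]
        for p in book_paths
    ]


def convert_all(book_paths, out_dir):
    """Run the conversions one after another (requires calibre's ebook-convert)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(book_paths, out_dir):
        # check=True stops at the first book that fails to convert
        subprocess.run(cmd, check=True)
```

Calling `convert_all(["library/some_book.epub"], "pdfs")` would then produce `pdfs/some_book.pdf`, ready to be imported into a PDF manager.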

There are other benefits of unlocking DRM, including the ability to avoid vendor lock-in (e.g., read your Amazon eBooks using Apple iBooks), guard against arbitrary and unfair removal of your books, and overcome silly device download limits. For some of us, opening multiple books at the same time is another big plus. I suspect that over time, eBook DRM will go away. The eBook industry is at the stage that music was at 10 years ago, when we had to rip music from our personal CD collections or the proprietary formats on iTunes and convert it into unlocked files that were more flexible. Today music is sold unlocked and I don’t see why it should end up otherwise with eBooks.

(PS: yes, I know eBooks are licensed, not sold, but let’s save that for another discussion.)

Reading multiple books at once
Your 30″ monitor can show all these books at the same time

Cloud based Econometrics and Statistics Software

While the rest of the world was busy with Apple’s iCloud, I spent the past few weeks working on a large-scale empirical project.

In the process I learnt a few things about cloud-based options for statistics and econometrics. The situation has developed quite a bit since Robert Grossman’s earlier post on using Amazon’s cloud for this purpose. Amazon now has a browser-based graphical dashboard to easily manage your cloud-based machines, instead of relying upon command line tools.

In my view, there are three relevant areas that are at different stages of cloud-readiness for empirical economists and statisticians:

1. Databases and datasets

Cloud-based solutions are great, especially for large datasets that require scalability. They are also good for research projects in which multiple people need access to the data (e.g., if your project involves multiple coauthors or research assistants).

For simple projects with small datasets, a shared spreadsheet on Google Docs should suffice. For larger datasets, one good option is Amazon RDS, which is price-competitive and offers both MySQL and Oracle databases; it is easy to maintain and back up. Another option is Microsoft Azure. We use PostgreSQL and Ubuntu on EC2 for analysing patent data.

One advantage of cloud-based databases is that the technology is now mature. Demand is driven by many other business and scientific applications. We therefore benefit from positive knowledge spillovers. It is relatively easy and inexpensive to hire an RA with a computer science background to build an SQL-based dataset. A second advantage is that a number of other research databases are now starting to appear in the cloud, making them easy to interface with cloud-based programming. These include data on patents and genes, as well as US Federal Reserve and Census data. The number of research databases in economics and the social sciences is still small, but growing.

2. Regression and statistical software

This is where things are disappointing. Most of the popular software packages are not widely affordable for cloud use, including Stata and SPSS. You hit licensing snags. A small number of private service providers bridge this gap by offering High Performance Computing (HPC) solutions, e.g., for Mathematica, but they are pretty expensive (at least to an academic researcher). Matlab will work but requires a ‘distributed server’ license that will cost a fortune. In general, these software companies want to sell you a 2-core or 4-core license that will run year-long on that computer on your desktop. What some of us need instead is a license that will run on 64 cores across 16 machines for just one month during which we are doing intensive number-crunching. More importantly, we want that license to be easy to transact, without going through a complicated application and registration process. You might think this sort of licensing doesn’t exist, but I would argue that it is already happening, including with software such as Microsoft Windows Server and Oracle, which you can now rent on Amazon’s AWS cloud for whatever length of time you want, and with no transaction costs.

As a result of these issues, if you are on a budget your best bet is the open source “R Project”, a statistical and econometrics toolkit that is growing by leaps and bounds in popularity. It runs in the Amazon cloud on both Linux and Windows. By combining R with a software technique known as MapReduce, you can easily split your program into portions that are run on multiple computers and have the results aggregated back elegantly. Here is a good example of using R with MapReduce by Stephen Barr, and another by Jeffrey Breen. I will be looking into using more of this in my projects.
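To make the MapReduce idea concrete, here is a minimal sketch of a simple one-regressor OLS fit split into a map step and a reduce step. It is written in Python rather than R, it is not taken from the examples linked above, and it uses a local thread pool as a stand-in for separate cloud machines. All function names are hypothetical. The point is that each chunk of data only needs to report a handful of sums, which are then added together before the coefficients are computed.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(chunk):
    """Map step: summarise one chunk of (x, y) pairs as (n, sum x, sum y, sum x^2, sum xy)."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return (n, sx, sy, sxx, sxy)


def combine(a, b):
    """Reduce step: add two tuples of partial sums element by element."""
    return tuple(u + v for u, v in zip(a, b))


def ols_from_stats(stats):
    """Compute intercept and slope of y = a + b*x from the aggregated sums."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


def mapreduce_ols(data, n_chunks=4):
    """Split data into chunks, map each chunk in a worker, then reduce the partial sums.

    Assumes data is a non-empty list of (x, y) pairs with at least two distinct x values.
    """
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # Threads stand in for cloud machines in this sketch
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return ols_from_stats(reduce(combine, partials))
```

Because the partial sums are tiny compared with the raw data, only a few numbers per chunk travel back to the machine that does the final aggregation.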

3. Cloud-based programming

Instead of running mathematical or analytical programs on your desktop, you can run them in the cloud. This works best if you can partition the problem into little chunks that can be worked upon independently. For example, we use Perl for text processing of patent data. I know of people who code in Fortran/IMSL or C and generate binaries for optimization and numerical simulations. It is nice to be able to activate a dozen machines to process the data quickly instead of waiting a week for the results.
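The partitioning idea can be sketched as follows. This is an illustration in Python, not the Perl code we actually use. Records are dealt into independent batches, each batch is processed without reference to the others, and the results are merged at the end. The function names are hypothetical and a thread pool stands in for separate cloud machines.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_workers):
    """Deal items round-robin into n_workers batches that share no state."""
    return [items[i::n_workers] for i in range(n_workers)]


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def process_batch(batch):
    """Process one batch of (doc_id, text) records independently of all others."""
    return [(doc_id, tokenize(text)) for doc_id, text in batch]


def run(records, n_workers=3):
    """Partition the records, process each batch in its own worker, merge the results."""
    batches = partition(records, n_workers)
    # Threads stand in for cloud machines in this sketch
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(process_batch, batches)
    return dict(pair for batch in results for pair in batch)
```

Since no batch depends on another, adding machines shortens the wall-clock time roughly in proportion, which is what makes this kind of job a good fit for rented cloud capacity.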

Other considerations

A side benefit of this approach is a quiet office. Some years back, I had a powerful workstation in my office with an 8-disk RAID array, multiple CPUs and dual power supply units. It was really noisy! Also, the cleaner had a habit of switching it off, ruining my calculations. Migrating my data analysis into the cloud allows me to now have a quiet and peaceful office, where I can think and write.

If you have any thoughts/comments about cloud based solutions, or know of other useful resources or tips, please share them in the comments below. Thanks.

e-books are overtaking printed books

Australia Radio National recently did a radio program on e-books at the Brisbane Writers Festival. Of the 4 panelists, only one actually owned an electronic book reader. A number of benefits of e-books were cited, including convenience of purchase, lower book prices (especially compared to the prices of printed books in Australia), and better access from rural locations. However, the overall impression was that printed books and traditional bookstores will continue to exist for some time. One of the panelists stated that printed books will still constitute 70% of the market within a decade. Another panelist felt that bookshops will continue to exist because they are a nexus of social activity.

Let me be the first to say I love bookshops and have a large library of printed books. That said, these people clearly did not get the memo from Jeff Bezos that the number of e-books sold by Amazon has already overtaken hardcover books and will overtake paperbacks by next year. The recent launch of the iPad, multimedia e-books, and this week’s launch of the third generation Kindle (only US$139) are going to accelerate the process. Having used both e-books and printed books for some time, all I can say is that many of the complaints people mentioned in the podcast have been addressed, or are being addressed, in the newer e-book readers. Change is happening faster than many people think. This week alone I bought 7 books on Kindle for a course I’m teaching, and I have no complaints.

One way to address the gap between perception and reality is to allow more customers to get their hands on an e-book reader, such as at retail outlets and other public places. From personal experience, people who complain about e-books are often surprised by how usable they are after I’ve put an actual device into their hands for the first time. I’ve also noticed that at a lot of places where e-book readers are sold, they are displayed all wrapped up or inside glass cabinets, rather than in a way that invites people to experience them. This is something e-book retailers such as Amazon and B&N should address, maybe taking a page out of Apple’s book to make the shopping experience much more hands-on.

Internet Sales and Copyright

Last month, CMCL and IPRIA organized an event to discuss whether internet service providers should be responsible when their customers upload/download illegal material. I wasn’t able to attend, but Sarah Berriman, one of my young and energetic students, did. She wrote to me expressing how a ‘generation gap’ exists in the way the music industry perceives its customers.

The main points made by each speaker were interesting, but also not unexpected. There seems to be a disconnect between the copyright holders and their customers.  They don’t seem able to adapt to the needs and desires of the current generation, and it is a mystery to many why this should be.  It is not as though broad and expensive market research is required to tell them this: the cinema is perceived as poor value for money.  People like convenience, and having things when they want and how they want.  Is it really such a leap to move to simultaneous international online sales?  Is copyright the issue here, or the protection of an outdated business model?  Perhaps it is time infringers were offered a carrot, instead of perennially getting the stick. [reproduced with permission from Sarah Berriman, Medical Student, Melbourne University]

Today I received a stack of books from amazon.com. They were shipped all the way across the world from Kentucky in the United States, by the very same company that refuses to sell me those books as electronic downloads because we are outside the United States. I am glad I’m young enough to fall on the same side of the generation divide as Sarah.