Free as in “you can get it in black”
July 29th, 2008
Free Microsoft tools for scholarly communication:
- This is for real. Don’t mistake the Microsoft research division, which doesn’t sell anything, for the Microsoft product divisions. Tony Hey believes in open access and open data, and is putting Microsoft resources behind them. For background, see Richard Poynder’s interview with Tony Hey (December 2006), and my previous post on the Microsoft repository platform (March 2008).
- The new tools are free of charge. The announcement doesn’t say they will ever be open source, but Microsoft encourages open-source tools in the open chemistry projects it funds. So it’s possible.
Not cross-platform, though. I can’t take any Microsoft division seriously on open anything until they make tools like this simultaneously available on Macintosh and other platforms. Till then, it’s all just marketing bullshit. Apple’s not perfect in this wise, either; but open from Microsoft usually means “loss leader.”
(Via Open Access News.)
Technorati Tags:
geek, mac, open access, microsoft, open data, oss
Hive No-Mind
February 13th, 2008
Kevin Kelly — The Technium: “As Clay Shirky puts it: here comes everybody!”
Yeah, he’s something, that James Joyce, er, Clay Shirky.
(Via no via.)
Technorati Tags:
libraries, web, writing, literachoor, james+joyce
Still no E-Z book ripper
February 11th, 2008
Levy: Rip This Book? Not Yet. | Newsweek Voices - Steven Levy | Newsweek.com:
“Then I tested a BookSnap for myself. Short verdict: not a revolution. More a thud than a snap, the device—an ominous three-foot high construction draped with a thick black darkroom-style shade—looks like a Goth puppet theater and weighs 44 pounds. Under the shade is an angled cradle for a book and a glass platen to hold the pages down during scanning. You turn the pages yourself. It costs $1,600, not including the two Canon digital cameras (about $500 each) necessary to capture the page images and send them to your computer, where software transforms the pictures into files that can be read on a screen or an e-book reader. It takes considerable fiddling to get images set up properly. Supposedly, once you get started you can digitize 500 pages per hour, much faster and at higher quality than with flatbed scanners (which are much cheaper but not optimized for book scanning). I never got that far, but I imagine such a feat would require considerable caffeination.”
It’s almost impossible to sell self-digitization to the iPod generation, because - as Levy points out here - it’s so much more labor-intensive than ripping a CD. Even ripping vinyl albums to MP3 is much easier and can also be started and then run mostly unattended. Scanning a book is a tedious process and you can’t really do anything else (well, maybe rip CDs) while you’re doing it. Atiz is commendably trying to get to an appliance model for book scanners, but the BookSnap isn’t it. You’d really need something along the lines of the Kirtas technology for that.
(Via Digitization 101.)
Technorati Tags:
libraries, digitization, e-books
NYPL’s new MyLibraryDV and Macs
February 6th, 2008
Reading the NYPL monthly newsletter this morning, I saw what looked like a great new service: MyLibraryDV. From the newsletter:
Download classic films, Hollywood hits, lifestyle programs, and more — for free! All you need is your NYPL library card, high-speed Internet access, and MyLibraryDV to access more than 1,000 movies and TV series, including favorites like Antiques Roadshow and America’s Test Kitchen.
Well, that, and a Windows machine, or an Intel-equipped Mac with BootCamp, Parallels, or VMWare Fusion:
Can I use a Mac with the service?
The Download Manager for MyLibraryDV is a Windows .exe file that can only be installed on computers running Windows 2000 with SP4 or Windows XP with SP2, which enables you to run Windows Media Player. You can use a Mac to operate the Download Manager and view videos if you have an Intel processor and Windows 2000 with SP4 or Windows XP with SP2 operating system installed and running. Macs without this capability will not be able to install and use the Download Manager.
So the answer here is “not really,” though of course you can make the case that a Mac running Windows does it better and more stably than a PC. (Ask me sometime about the epic struggle it was to burn 3 Word docs to a CD on a Windows laptop yesterday. Why people put up with this stuff is beyond my comprehension. Well, besides “they have to.”) But anybody with a G* is out of luck. NYPL, you’re better than this. Really.
Monster truck info
December 20th, 2007
We have recently begun sending Biodiversity Heritage Library materials to the Internet Archive scanning pod at NYPL. We’re currently trying to get the workflow in place, and so we recently purchased one of these Samson Book Carts to send stuff down. They’re perfect in a lot of ways: rugged, collapsible, huge capacity. Unfortunately, they’re also too tall (by about 4″) to fit in the van we’re using to transport books. I’ve been researching big book carts to no avail. If anyone knows of one similar to the Samson but a little shorter, I’d appreciate knowing about it. Thanks. Isn’t it interesting how 90% of digitization works out to be logistics?
Technorati Tags: digitization, libraries, mpow
“Email is for old people.”
December 7th, 2007
More on the oversimplicity of “Digital Natives” etc. (The Googlization of Everything):
As Henry Jenkins writes, there is so much interesting stuff going on out there among age groups, among members of communities, and across oceans that flattening out everyone into “generations” or “natives” and “immigrants” is just false and useless.
It also has real-world implications. Once we assume that the kids out there love certain forms of interaction and hate others, we forge policies and design systems and devices that meet our presumptions. By doing so, we either pander to some marketing cliche or force an otherwise diverse group of potential users into a one-size-fits-all system that might not meet their needs.
Also see the first comment for the predictable “it is TOO” take on things, replete with the usual ageist assumptions and based mainly on hypotheticals and anecdotal evidence.
Technorati Tags: culture, digitization, geek, libraries
More on the Kindle and privacy
November 29th, 2007
Just when the Kindle is appearing with its own Trust Us approach—Amazon stores everything for itself and maybe unwittingly for Washington—D.C. comes along to remind us of the risk of Big Bro even without the Kindle. Via an AP story, we learn that federal prosecutors sought “the identities of thousands of people who bought used books through online retailer Amazon.com Inc.”
No word on how far they got in the used books. But some other highlights from the post:
Meanwhile Jeff Bezos and friend will be playing do-it-yourself snoops through a TOS specifically authorizing them to poke around your machine to see if you’ve been a good boy or girl. Naughty, naughty, naughty you’ll be if Jeff somehow finds you’ve been bypassing the DRM, and I doubt the punishment will be just a lump of coal. Away could go your Kindle service and book access—just read Amazon’s Terms of Use: “In case of such termination, you must cease all use of the Software and Amazon may immediately revoke your access to the Service or to Digital Content without notice to you and without refund of any fees.”
And since that Kindle’s got no other use but reading e-books that you get from Amazon, you then have a brick. An ugly, beige, $400 brick. But wait, there’s more:
Meanwhile here’s another gem from Jeff’s snoop-friendly terms of service: “The Device Software will provide Amazon with data about your Device and its interaction with the Service (such as available memory, up-time, log files and signal strength) and information related to the content on your Device and your use of it (such as automatic bookmarking of the last page read and content deletions from the Device). Annotations, bookmarks, notes, highlights, or similar markings you make in your Device are backed up through the Service. Information we receive is subject to the Amazon.com Privacy Notice.”
Which privacy policy is then quoted, vague enough that you could easily get sold out to the feds. One thing you can say for the paper book, Amazon can’t turn it off. As much as we might want to get over the pesky inconvenience that privacy poses to the growth of social marketing by denying it exists, there are real and serious consequences to doing so. Relabeling it “identity management” in order to productize it and reduce it to a purely technological problem won’t help either. Just because you might not care who knows what you read doesn’t mean they should find out.
Technorati Tags: amazon, books, ebooks, kindle, libraries, privacy, politics, rss
Social metadata
November 26th, 2007
What I Learned Today… » Blog Archive » The Return of Everything is Miscellaneous:
…Weinberger touches on the future of the ebook. He talked about how we could collect data from how people read books, the passages they highlight, where people read books and so much more using wireless enabled ebook readers (p.222) - and while it sounds like science fiction - we’re almost there. Kindle has the power of wireless technology - meaning that in theory, Amazon could connect to our readers and collect data. While this sounds scary and like a huge invasion of privacy - imagine the power that this data could provide. Some examples Weinberger has is that you could create a list of books that people most often read at the beach or a list of books people stopped reading 1/2 way through - how cool would that be?
Well, because the only people I can think of who would find that data valuable would be marketers. So I don’t think it would be that cool. And it is scary and a huge invasion of privacy. When the government starts asking Amazon for tracking data on where you and your Kindle were last Tuesday, you probably won’t think it’s very cool either. Especially if you can’t turn it off.
Technorati Tags: amazon, digitization, kindle, ebooks, writing
OCRopus Garden
October 25th, 2007
Ars reviews Google’s OCRopus scanning software. We may play with this a bit internally; everybody seems to use Abbyy, but everyone also seems to think that OCR pretty universally sucks, based on the anecdotal evidence I have heard. What I found especially interesting in this review was the huge difference in results from sans-serif rather than serif text:
The following examples show the typical output quality of OCRopus:
Tpo’ much is takgn, much abjdegi qngi tlpugh we arg not pow Wat strength whipl} in old days Moved earth and heaven; that which we are, We are; QpeAequal_tgmper of hqoic hgarts, E/[ade Qeak by Eirpe ang fqte, lgut strong will To strive, to Seek, to hnd, and not to y{eld.
Tho’ much is taken, much abides; and though We are not now that strength which in old days Moved earth and heaven; that which we are, we are; One equal temper of heroic hearts, Made weak by time and fate, but strong in will To strive, to seek, to find, and not to yield
Night and day. Of course almost everything we would possibly be hoping to OCR would be serif text. Ain’t it allus the way.
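For anyone who wants a number to go with “night and day,” here is a rough sketch that scores the opening of the garbled sample against the clean one. It assumes the clean output can stand in as the reference text, and it uses Python’s standard difflib similarity ratio, which is a convenience measure and not a proper OCR character error rate. The function name is made up for illustration; it is not part of OCRopus or the Ars review.

```python
from difflib import SequenceMatcher

# Opening of the clean sample quoted above, treated here as the reference
# text (an assumption: the review does not supply a ground-truth transcript).
reference = (
    "Tho’ much is taken, much abides; and though "
    "We are not now that strength which in old days"
)

# Opening of the garbled sample quoted above, copied as-is.
garbled = (
    "Tpo’ much is takgn, much abjdegi qngi tlpugh "
    "we arg not pow Wat strength whipl} in old days"
)


def similarity(a: str, b: str) -> float:
    """Return difflib's similarity ratio: 1.0 for identical strings, lower as they diverge."""
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    print(f"garbled vs. clean: {similarity(reference, garbled):.2f}")
    print(f"clean vs. clean:   {similarity(reference, reference):.2f}")
```

Running the same comparison over whole pages, with a hand-checked transcript as the reference, would be a fairer test than a few lines of Tennyson.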
Technorati Tags: digital libraries, digital_libraries, digitization, google, google books, libraries, linux, ocr, scanning, ubuntu
Wrighting the rong
October 25th, 2007
While reading a Kevin Kelly post about an HG Wells novel that actually was credited in real scientific work, I saw this graphic:

And thought “Cool! A link to the book in the Internet Archive!” Alas, I was wrong. Not only was the image not linked to the IA copy - the image wasn’t linked to anything - the link later in the post was your standard Amazon Associate link. Disappointing. So I’ll right that wrong here:
Go forth and read freely.
Technorati Tags: archive.org, ia, libraries, oca, openlibrary

