Detecting Gender by Prose and How to Make Everyday Things
By PAMELA LiCALZI O'CONNELL
Does your writing style indicate whether you are male or female? In a development reported in The New York Times
Magazine and elsewhere, Israeli researchers recently created an
algorithm that can predict the sex of an author from a writing sample.
After the news coverage, some cheeky members of an online book
discussion site, BookBlog, turned the algorithm into a Web application
called Gender Genie (www.bookblog.net/gender/genie.html)
that allows anyone to punch in some prose and receive a verdict. That's
when the fun started. Since Aug. 15, Gender Genie has analyzed 250,000
documents submitted by 125,000 people, according to Mary Delli Santi,
who runs the site. But although the researchers achieved an 80 percent
accuracy rate, Gender Genie correctly guessed an author's sex only
about half the time.
By e-mail, one of the scientists, Moshe Koppel, argued that the
algorithm has not been applied correctly. (The researchers are working
with Ms. Delli Santi on revising the Web version.) He also said that
people were entering texts that were too short and ignoring the fact
that the algorithm was designed to assess fiction, not blog entries,
e-mail or the other sundry nonfiction samples that dominate submissions
(for example, the Gender Genie said this column was written by a man).
Beyond the accuracy of the algorithm, the draw of Gender Genie has been
to see how differently men and women react to the site's sexual
judgments. "Men are definitely more sensitive about having their
writing labeled as female" than the other way around, Ms. Delli Santi
said. And some find the results too stark: "Transgendered people are
asking that the Genie come up with more results than just male or
female."
All sorts of fun and
funky applications related to Amazon.com are popping up, only a
fraction of them from the company's development team.
Over a year ago, Amazon provided outside programmers with a free
interface for interacting with its vast product database. (Google and eBay
have done the same.) The result has been a creative explosion of
ancillary Web sites and services, as detailed in "Amazon Hacks," a new
book by Paul Bausch. "People should now think of Amazon as a technology
platform," Mr. Bausch said. "The company does not know or control how
people use its data. It's a bit surprising the way Amazon has embraced
Some applications build on features that Amazon already
provides. For example, Amazon offers a list of its top-selling items
over the last 24 hours. But what if you want to track the sales rank of
specific items over time? The independent site JungleScan.com allows
anyone to set up a free portfolio to do just that. (Mr. Bausch uses it
to track his book sales, for instance.)
Another of Mr. Bausch's
favorite hacks (the book lists 100) lets him call up his Amazon Wish
List from his cellphone. "You can see why Amazon would not create such
an application itself," he said. "Why would they want you to access
your Wish List while you were standing in Barnes & Noble?"
Some of the hacks are a bit silly but indulge the desire of users to treat
Amazon as a blank slate on which to project their own obsessive
interests. Take the application called MP3 Piranha (www.capescience.com/piranha).
If you forgo buying CD's to collect MP3 music files on your computer,
you are missing out on the original album cover art. This free
application asks Amazon to automatically pull the cover art for the
albums from which your song collection is drawn.
Make My Day
How is a jelly bean made? A plastic bottle? An airplane? At the eye-opening site How Everyday Things Are Made
(manufacturing.stanford.edu), more than 40 products and manufacturing processes are explained. Think of it as the mother of all factory tours.
The new site, based on a course offered annually to non-engineers at
Stanford, provides nearly four hours of video of industrial plants. (A
high-bandwidth connection to the Internet is necessary.) "We tried to
focus on which processes are most mysterious to people," said Mark V.
Martin, a lecturer at Stanford and the chief executive of design4X, the
company that developed the site with the Alliance for Innovative
The site's interactive features include discussion
boards and, fortunately, a slow-motion dial, since some factories
appear to operate at warp speed. Recurring sections ask site visitors
to apply the knowledge they have just acquired to a new problem ("How
would you get nuts or a creamy filling inside a chocolate candy bar?").
You must submit a response to find out the answer.
Mr. Martin would like to create a version of the site for younger children that
would focus more on toys and food. Yet "the food industry is the
hardest industry to get information out of," he said. "Apparently the
breakfast-cereal people have a lot of proprietary manufacturing
On the Radar
Acceptance speeches, sermons and political orations are a few examples
of what can be found in the "online speech bank" at
americanrhetoric.com. Don't skip the debatable list of the top 100 speeches. Imaging Everest (imagingeverest.rgs.org)
offers 20,000 photos from expeditions up that majestic mount. Most of
what can be found at United States Government Graphics and Photos (firstgov.gov/Topics/Graphics.shtml) is available for use in the public domain. Paste away!