30 September 2013

On science writing: Gender

An interesting little tool popped up in my newsfeed the other day, and it only served to fuel my preoccupation with writing style and clarity. Gender Guesser is a system that estimates the gender of a writer from a submission of at least 300 words of text. The estimate is based on word frequencies and parts of speech. I won't talk more about the specifics of the algorithm, except to say that the original research doesn't seem to include any discussion of science writing in particular.

Being a somewhat obsessive data collector, I proceeded to submit a broad selection of my own writing to the online interface. An example of my results appears below (this result is actually from the same blog post I wrote about revising last week).

I'm not really surprised that the tool estimates I'm male for nearly all of my writing, often with very high (i.e., >90%) confidence, and for both informal and formal writing. I tested ~20 writing samples of appropriate length, including posts from this blog, personal writing, and even excerpts from my last publication, for which I am sole author. At best, I am only scored as weakly female (the semantics of which are another issue altogether). The only exception is a blog post from over four years ago.

What are the implications? The authors of the web interface for Gender Guesser note that females writing in fields dominated by males (and I believe biology qualifies) will tend to score as male. Have I been trained to write in a more masculine manner? Moreover, do I really care if my writing possesses masculine characteristics? Perhaps a more important part of this discussion is which style of writing appeals to a wide range of readers, or whether readers deliberately or subconsciously discern the gender of a writer from an anonymous sample.

