Friday, May 25, 2012

Two pioneering women programmers

The National Center for Women & Information Technology (NCWIT) is wrapping up its summit today. I’ve attended much of the first two days, and the presentations on research and projects related to women’s opportunities in computing have been some of the best I’ve encountered anywhere. Lots to write about in those presentations! Let me begin with a little story about two remarkable programmers, Lucy Simon Rakov and Patricia Palombo.

Lucy Simon Rakov and Patricia Palombo were the recipients of the NCWIT Pioneer Award, and girl, were they ever pioneers! These two women were programmers for the Mercury space program, the first American program to send a person into space and home again. They did all of this with about 120 KB of raw computing power! (The next time I hear some would-be Steve Jobs tell me his code is elegant, I’m gonna laugh in his face.)

Mark Guzdial has a nice post on these great women on his Computing Education Blog:

NCWIT Pioneer Awards to two women of Project Mercury: Following their passions http://bit.ly/Lslrio


My trans-analytic voyage

New piece in a new publication: “My Trans-analytic Voyage: Text Analytics on Both Sides of the Atlantic” contrasts my observations at analytics conferences in the US and Europe.


Chicago Web Analytics Meetup has a new home

The Chicago Web, Game and Social Media Analytics Meetup has been around for several years and has developed a substantial membership. Now, the group has a new home. Thoughtworks, a global IT consultancy based in Chicago, will host meetings at its headquarters at 200 East Randolph. Last week, I presented “Crossing the Language Chasm: Extracting Information from Foreign-Language Text” for the group at the new location, and it was a pleasure. The space is roomy, comfortable and a great match for this use. The meeting was well attended, and I expect that the new space will help to build attendance.

If you didn’t get to attend the presentation, you can read the original article on Smart Data Collective:

Crossing the Language Chasm: Extracting Information from Foreign-Language Text


Graph Databases and Analytics

While in London, I attended a talk by Nicki Watt and Michal Bachman, two Open Credo software developers who shared their experience building a recommendation engine based on the Neo4J graph database. I asked why they had chosen that particular platform, and got a simple answer – they didn’t, the decision had been made before they came on to the project. But they did explain some useful things about what graph databases are and what they do well, not to mention what they don’t do so well.

Graph databases are said to be “schema-less”. They don’t have the relatively rigid structure that we expect in relational databases. Instead, they can store a wide variety of information, from numbers to video and more, organized in a relatively flexible structure described by a changeable graph. Neo4J is only one of many such databases; others that you may encounter include AllegroGraph and FlockDB. (MongoDB is also schema-less, but it is a document database rather than a graph database.) The advantages of the graph structure include rapid creation of and changes to a database, and excellent performance for many routine operations.
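
To make “schema-less” a bit more concrete, here is a minimal sketch in plain Python. It is not Neo4J’s API or anything the speakers showed; the class TinyGraph, its methods and the sample data are all hypothetical names I made up for illustration. The point is only that each node carries whatever properties it happens to have, and relationships can be added at any time without declaring a table layout first.

```python
class TinyGraph:
    """A toy in-memory property graph (hypothetical; not a real database API)."""

    def __init__(self):
        self.nodes = {}   # node id -> dict of arbitrary properties
        self.edges = {}   # node id -> list of (relationship, target id)

    def add_node(self, node_id, **props):
        # No schema: every node may carry a different set of properties.
        self.nodes[node_id] = props

    def relate(self, source, relationship, target):
        # Relationships are stored with the source node, so following
        # them is a local lookup rather than a table join.
        self.edges.setdefault(source, []).append((relationship, target))

    def neighbors(self, source, relationship):
        return [t for (r, t) in self.edges.get(source, []) if r == relationship]


g = TinyGraph()
g.add_node("alice", kind="person", age=34)
g.add_node("clip42", kind="video", minutes=3.5, tags=["demo"])
g.relate("alice", "WATCHED", "clip42")

# Later additions need no migration: a new node with different
# properties and a new relationship type are simply added.
g.add_node("bob", kind="person", city="Chicago")
g.relate("alice", "KNOWS", "bob")
```

Asking for g.neighbors("alice", "KNOWS") only touches the relationships stored with that one node, which is the kind of routine operation where the graph structure tends to do well.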

What graph databases aren’t made for is analytics. They don’t lend themselves to operations that might require aggregating large quantities of data, or random sampling, or classical statistical analysis. Analytics can easily bog graph databases down to a standstill.

There are practical situations where you can work around these limitations and end up with good results. So, for example, when making a recommendation, the trick might be to use a relatively small number of easily accessible cases and choose the best among them. Think of how people find partners – getting to know the people in the vicinity and evaluating them as potential partners, rather than traveling far and wide in search of an optimum mate. Another strategy includes serving an old result while waiting for a new one to be calculated, so the user never experiences a long wait for response. Graph databases perform well for transactional applications and those where a quick analysis of a modest number of similar cases fills the bill.
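
The two workarounds above can be sketched in a few lines of Python. This is my own illustration under stated assumptions, not the Open Credo implementation: the dictionaries KNOWS and BOUGHT and the functions recommend, recommend_fast and refresh are hypothetical, and a plain dictionary stands in for whatever cache a real system would use.

```python
from collections import Counter

# Hypothetical sample data: who knows whom, and what each person bought.
KNOWS = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann"],
    "dan": ["bob"],
}
BOUGHT = {
    "ann": {"tea"},
    "bob": {"tea", "mug"},
    "cat": {"mug", "kettle"},
    "dan": {"scarf"},
}


def recommend(person, limit=2):
    """Rank items bought by direct neighbors only; never scan the whole graph."""
    already_owned = BOUGHT.get(person, set())
    counts = Counter()
    for friend in KNOWS.get(person, []):
        for item in BOUGHT.get(friend, set()) - already_owned:
            counts[item] += 1
    return [item for item, _ in counts.most_common(limit)]


# Strategy two: keep the last answer and serve it while a new one is pending.
_cache = {}


def recommend_fast(person):
    """Return the stored answer if there is one; compute only on first request."""
    if person not in _cache:
        _cache[person] = recommend(person)
    return _cache[person]


def refresh(person):
    """Recompute and store; a real system would run this in the background."""
    _cache[person] = recommend(person)


first_answer = recommend_fast("ann")
```

The first function is the “people in the vicinity” idea: it looks only at direct neighbors, so the work grows with the size of the neighborhood rather than the size of the database. The second part accepts a slightly stale answer in exchange for a response that never waits on a recalculation.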

So what about classical statistical analysis, data mining and exploration? What about operations research? My take is that we will still do best to keep as much of that work away from transactional systems as possible, and that planning to create and maintain a relational database for analyst use should be part of the process when architecting new applications.


Text Analytics Summit Europe

Text Analytics Summit Europe took place April 23-24, and I had the opportunity to speak there. My presentation, “Cross-lingual Text Analytics: A New Frontier in Linguistic Technology”, was based on my article of the same title that appeared in Multilingual magazine earlier this year. In that talk, I explained the meaning of “cross-lingual” text analytics, the process involved, and why translating text to feed into English-language text analytics tools is undesirable.

The London group was much more motivated to talk about languages other than English than any audience I’ve encountered in the US! There were several other speakers discussing issues related to non-English text analytics, including some case studies. And the discussion during breaks and such was very different from the US. Americans need to smell the coffee and realize that if we don’t rise up and get into customer engagement and text analytics for languages other than English, we’ll be losing business to international competitors who will get there first. Believe me, they have a huge head start!


Text Analytics Summit Boston

Back from a long road trip and recovered from jetlag, I must now get back to writing! I just finished a piece for Language Technology News (http://langtechnews.hivefire.com/) and will post a link when that’s available. In the past few weeks I have given three presentations on text analytics – in San Francisco, London and Chicago – and I’ve heard many other interesting speakers, so I have some new stories to tell over the next couple of weeks.

Next up – I’ll be giving the keynote presentation at Text Analytics Summit Boston in June! http://www.textanalyticsnews.com/text-mining-conference/ You can read the conference agenda here: http://www.textanalyticsnews.com/text-mining-conference/conference-agenda.php. Hope to see you there!


The very, very first Social Media Analytics Summit

This week I have been in San Francisco for the very first Social Media Analytics Summit. It was a lively event with lots of solid content, well worth the cross-country trip.

One of my favorite presentations was a panel discussion on the biggest arguments in social media, with Susan Etlinger of Altimeter Group, Lisa Joy Rosner of Netbase and Catherine van Zuylen of Attensity. The experts spoke their minds on touchy topics. I wish I could remember the exact words that Susan Etlinger used to introduce her pet concerns about social media. As near as I can remember, she called them “the four social media myths of the apocalypse”: sentiment, influence, reach and engagement. All popular, and all very elusive measures.

I gave a talk, “Capitalize on Multi-lingual Text Analytics,” on a topic suggested by Ezra Steinberg, who organized this conference. It was the only talk at this conference on working with languages other than English. At last year’s Text Analytics World in New York, not one presentation was devoted to non-English text. Next week I’ll be in London for Text Analytics Summit Europe, and you can bet on a different story there!


Upcoming presentations

Social Media Analytics Summit, April 17-18, San Francisco
Capitalize on Multi-lingual Social Media Analytics

European Text Analytics Summit, April 23-24, London
Cross-lingual Text Analytics: A New Frontier in Linguistic Technology

Chicago Web, Game and Social Media Analytics Group, May 2, Chicago Free!
Crossing the Language Chasm: Extracting Information from Foreign-Language Text

Predictive Analytics World, June 25-26, Chicago
Cross-Language Text Analytics: Overcoming Language Barriers


From the horse’s mouth

The New York Times report about Target’s pregnancy prediction model made a splash in 2012, but Predictive Analytics World had it in 2010, presented by Target’s own Andrew Pole. His talk is much clearer on just how hard and imperfect this process really is. See the video.


Sentiment Analysis Symposium May 8, 2012 in New York City

Sentiment Analysis Symposium is coming up May 8 in New York. I attended this event last year in San Francisco, and it was worth the trip.

http://sentimentsymposium.com/