The easiest part is getting all the news data: Twitter offers a streaming API for free, and you can get most news sites as an RSS feed or something similar.
Classification is the hardest part. You need to classify the news articles as [good|bad] for [company|industry] or whatever; this could probably be done with some kind of trained classifier, given a large enough training set. Also try to detect multiple news stories about the same topic. Then you have a news topic classified as good or bad, plus the popularity of that topic, and you can try to correlate that with the stock price and volume (relative to other companies in the same sector).
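
To make the classification step a bit more concrete, here is a minimal sketch of a bag-of-words naive Bayes classifier in plain Python. Naive Bayes is just one option I picked for illustration, not necessarily the best one. The `NaiveBayes` class, the `tokenize` helper and the four "acme" headlines are all made up; a real training set would need to be far larger.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep only purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, labelled_docs):
        for text, label in labelled_docs:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior: how common this label is in the training set.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-written training data purely for illustration.
TRAINING = [
    ("acme beats earnings expectations and raises forecast", "good"),
    ("acme wins large government contract", "good"),
    ("acme recalls product after safety complaints", "bad"),
    ("acme faces lawsuit over accounting fraud", "bad"),
]

model = NaiveBayes()
model.train(TRAINING)
```

Calling `model.classify("acme faces lawsuit over fraud")` picks the label whose training headlines share the most words with the input. With only four examples this is a toy, which is exactly why the training set size matters so much.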

Could be really hard though: news about one company might also affect its competitors, and so on.
If you build this, you could turn it into an awesome trading bot.
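
For the correlation step, a rough starting point might look like the sketch below. It assumes you already have, per day, a good/bad label and an article count for a topic, plus daily returns for the company and for its sector. The function names and the idea of scoring a topic as "sign times article count" are my own assumptions for illustration, not an established method.

```python
import math


def topic_score(label, article_count):
    """Signed popularity: positive for good news, negative for bad news."""
    sign = 1 if label == "good" else -1
    return sign * article_count


def relative_returns(company_returns, sector_returns):
    """Company return minus sector return, day by day."""
    return [c - s for c, s in zip(company_returns, sector_returns)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

You would build one list of `topic_score` values and one list of `relative_returns` values over the same days and feed both to `pearson`. Subtracting the sector return is a crude way to handle the sector-relative part; it does not solve the problem of news about one company moving its competitors too.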