[BNM] possible work
Jason Bailey
j.bailey at sussex.ac.uk
Thu Oct 5 09:06:35 BST 2006
Hello,
Would this be of interest to anyone? I've removed any references to names
or departments at Sussex, just in case (I'm not sure why). I've not edited it
for grammar or spelling.
Let me know if you might be interested in the programming bit. I guess
they're after a freelancer, and it's fine to be up front about fees etc., as
they will have to cost this into the research proposal. I'm not sure if
anything will come of it, but there's no harm in trying.
Jason ...
> My colleagues and I (one of them is from ******** at
> Sussex Uni.) are planning to write an ****** research proposal on
> analysing huge amounts of textual data from publications and patents.
> There are two IT-related technical aspects which we need to
> tackle. The first is gathering/downloading huge amounts of
> information from the internet into a database, and the second is
> analysing the textual records. We seem to agree on acquiring
> dedicated commercial software such as PolyAnalyst, SAS Text Mining or
> even SPSS Clementine to deal with the second issue. However, we still
> have the problem of how to "download" huge amounts of data automatically
> and repeatedly (scraping, spidering, whatever you may call it) from the
> internet. Do you know somebody (a student, perhaps) who is able to do this
> "Web-Client Side programming" and is willing to get involved in a research
> project? I myself am able to build such an automaton/robot in Perl, but I
> have only done that for simple websites (websites which use only HTML). Our
> problem is that many of the sites we would like to scrape use scripting
> languages (such as JavaScript) as well as PHP, which obscure the page links.
> Building a robot to scrape such websites is beyond my skill.
--
Jason Bailey
IT Services
University of Sussex
http://www.sussex.ac.uk/USIS/phone/details.php?id=17011