We could cobble some stuff from here #13

weaverbel · 2017-06-02T04:03:04Z

This is from the PYCAR Python course for journalists - from @elainewong
https://github.com/ireapps/pycar

Good basics for us to 'borrow', @richyvk @prcollingwood ?

richyvk · 2017-06-02T04:21:19Z

Def worth a look. Might be some 'examples' we can use. I think we have the programming concepts pretty well covered with what we've got, though.

prcollingwood · 2017-06-02T04:24:30Z

I think that for the purpose of getting librarians interested in learning more about Python, a shorter basics section like in Elaine's lesson is better. The standard software carpentry lesson spends the whole day on it without showing you why you might want to use Python. Elaine's suggestion of adding a CSV component is good as this shows you something useful you can do with Python.

elliewix · 2017-06-02T15:38:37Z

I've taught Python to LIS masters students for two semesters now, and here are some of the tasks that seem to resonate with them (and things I've done myself):

parsing XML files to create new data
comparing two sets of data (e.g. comparing a list of ISBNs to a new list and reporting what is new, missing, duplicated, etc.)
parsing a plain text file and filtering out lines based on a test (e.g. look through a list of bibliographic entries and flag the ones with malformed dewey entries)
parsing through a full text book and splitting all the chapters out into separate files
extracting all the hrefs from a table in a webpage
using regex to extract data points from a semi-structured text file to create a structured data set
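
To make the ISBN comparison task concrete, here is a minimal sketch using only the standard library. The function name `compare_isbns` and the ISBN-like strings are made up for illustration; they are not from the original (proprietary) data.

```python
from collections import Counter


def compare_isbns(old_list, new_list):
    """Report what is new, missing, and duplicated between two ISBN lists."""
    old_set, new_set = set(old_list), set(new_list)
    counts = Counter(new_list)
    return {
        # present in the new list but not the old one
        "new": sorted(new_set - old_set),
        # present in the old list but not the new one
        "missing": sorted(old_set - new_set),
        # appear more than once in the new list
        "duplicated": sorted(isbn for isbn, n in counts.items() if n > 1),
    }


# Made-up ISBN-like strings, for illustration only
old = ["9780000000001", "9780000000002", "9780000000003"]
new = ["9780000000002", "9780000000003", "9780000000003", "9780000000004"]

report = compare_isbns(old, new)
print(report)
```

Sets handle the new/missing comparison, while `Counter` is used for duplicates because a set would silently discard them.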

jt14den · 2017-06-02T20:22:16Z

@elliewix I like your tasks list. Do you have the data you've used for these?

elliewix · 2017-06-02T20:39:11Z

XML: proceedings of Digital Humanities 2014 are all in TEI and have sufficient complexity to do both basic and juicy things in XPath. http://dh2013.unl.edu/abstracts/index.html There's a zip file of XML docs.
ISBNs: you could use Faker to randomly generate some, or just make some up yourself. The original data was proprietary for that person.
Plain text list of bibs: again proprietary, but you could do some similar things with the Raven, as seen in this lecture: https://github.com/elliewix/LIS452-Spring2017Lectures/blob/master/Week-09.ipynb
full text book: I used Dracula from Project Gutenberg for this one, but directions aren't in my full Jupyter notebook
hrefs: many things would work for this one, but you could possibly use Wikipedia or other tables.
regex is here using Dracula: https://github.com/elliewix/LIS452-Spring2017Lectures/blob/master/Week-15.ipynb
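
As a rough illustration of the regex task (not the code from the linked notebook), the sketch below pulls a day, month, and place out of diary-style lines. The sample text and the pattern are assumptions invented for this example, loosely in the style of dated journal entries.

```python
import re

# Invented sample text; real source text will need a pattern tuned to it
sample = """\
3 May. Bistritz. Left Munich in the evening.
Some narrative text with no date at the start.
4 May. Castle. The landlord had a letter.
"""

# Day number, month name, then a place name up to the next period,
# all anchored to the start of a line
pattern = re.compile(r"^(\d{1,2}) ([A-Z][a-z]+)\. ([^.\n]+)\.", re.MULTILINE)

# Build a structured data set: one dict per matching line
rows = [
    {"day": int(m.group(1)), "month": m.group(2), "place": m.group(3)}
    for m in pattern.finditer(sample)
]

for row in rows:
    print(row)
```

Lines that do not start with a date are skipped, so the result is a small structured table that could be written out with the `csv` module.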

laufers · 2017-06-02T20:56:51Z

The URL parsing is a web-scraping exercise using Beautiful Soup or, if the URLs in the example you mentioned are in an HTML table structure, using Pandas. Maybe this task gets forwarded to the WebScraping folks? ex: http://ouinformatics.github.io/swc_beautiful_soup/

elliewix · 2017-06-02T21:19:13Z

They already have the task of getting all URLs in there (In [13]). That's a pretty classic example task for any web scraping lesson.

richyvk · 2017-06-06T00:12:39Z

@elliewix Like that list of tasks a lot. At least some can be done without external libraries which is good. All probably, but with more code needed.
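
For instance, the href extraction can be sketched with just the standard library's `html.parser`, at the cost of more code than Beautiful Soup would need. The class name `TableLinkParser` and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser


class TableLinkParser(HTMLParser):
    """Collect href values from <a> tags that appear inside a <table>."""

    def __init__(self):
        super().__init__()
        self.table_depth = 0  # > 0 while we are inside at least one table
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
        elif tag == "a" and self.table_depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth > 0:
            self.table_depth -= 1


# Invented sample page, for illustration only
page = """
<p><a href="/outside">not in a table</a></p>
<table>
  <tr><td><a href="/a">A</a></td></tr>
  <tr><td><a href="/b">B</a></td></tr>
</table>
"""

parser = TableLinkParser()
parser.feed(page)
print(parser.hrefs)
```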

Anyway, I think we have enough examples there to use throughout the lesson.

harshmangal · 2017-10-19T21:05:41Z

@elliewix how to parse data into separate chapters?

elliewix · 2017-10-19T21:10:57Z

I'm writing a smaller example of this now for my class, which could be more manageable than a full book.
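
In the meantime, one possible approach looks roughly like the sketch below. It is not the version from my class notebook, and it assumes each chapter starts with a heading line such as `CHAPTER I`; the function names, sample text, and output file names are made up for illustration, and a real Project Gutenberg file will likely need a different heading pattern.

```python
import re
from pathlib import Path


def split_chapters(text):
    """Split text into chapters, each starting at a 'CHAPTER <number>' line."""
    heading = re.compile(r"^CHAPTER [IVXLC\d]+\.?[ \t]*$", re.MULTILINE)
    starts = [m.start() for m in heading.finditer(text)]
    # Each chapter runs until the next heading, or to the end of the text
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]


def write_chapters(chapters, out_dir):
    """Write each chapter to its own numbered file and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, chapter in enumerate(chapters, start=1):
        path = out / f"chapter_{i:02d}.txt"
        path.write_text(chapter + "\n", encoding="utf-8")
        paths.append(path)
    return paths


# Invented sample text; anything before the first heading is dropped
sample = """Front matter to skip.

CHAPTER I

First chapter text.

CHAPTER II

Second chapter text.
"""

chapters = split_chapters(sample)
print(len(chapters), "chapters found")
# To save them: write_chapters(chapters, "chapters_out")
```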

harshmangal · 2017-10-19T21:25:20Z

ok. would you add that one to your github repositories?

elliewix · 2017-10-19T21:26:42Z

I have an in-progress version of the lesson here: https://github.com/elliewix/IS-452-Fall2017/blob/master/Lectures/Week09-While%26sentinelloops.ipynb

c-martinez added the mozsprint label Jun 2, 2017
