Support extraction from Doc and Docx #140

Open
fccoelho opened this issue Feb 19, 2013 · 2 comments

Comments

@fccoelho
Member

I have come across a collection of docx files in the demo account of the site. We should add support for this very common MIME type using Antiword or PyUNO.

Until this is supported, we should have a filter blocking the upload of such document types, so I am also marking this as a bug.
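
A minimal sketch of such an upload filter, assuming uploads can be checked by file name before they are stored. The names `UNSUPPORTED_EXTENSIONS` and `is_upload_allowed` are hypothetical and not part of the project; a real check might inspect the MIME type instead of the extension.

```python
import os

# Hypothetical blocklist of extensions the extractor cannot handle yet.
UNSUPPORTED_EXTENSIONS = {".doc", ".docx"}


def is_upload_allowed(filename):
    """Return False when the file name ends in a blocked extension."""
    extension = os.path.splitext(filename)[1].lower()
    return extension not in UNSUPPORTED_EXTENSIONS
```

The comparison is case-insensitive, so `Report.DOCX` is rejected in the same way as `report.docx`.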

@turicas
Contributor

turicas commented Nov 11, 2013

For docx format there's python-docx.
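
For reference when evaluating a library, here is a rough standard-library baseline rather than python-docx itself. It relies only on the fact that a .docx file is a zip archive whose main body lives in `word/document.xml`, with paragraphs as `w:p` elements and text runs as `w:t` elements. The function name `extract_docx_text` is made up for illustration, and the sketch ignores headers, footnotes, and formatting.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(path_or_file):
    """Return the body text of a .docx file, one paragraph per line."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; join them back together.
        runs = [node.text or "" for node in paragraph.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

This does not cover the legacy binary .doc format, which is not a zip archive and needs a separate tool.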

@fccoelho
Member Author

We need to compare it to other tools such as antiword, pandoc, or LibreOffice.
http://johnmacfarlane.net/pandoc/
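
One possible way to run such a comparison is to call the command-line tools on the same sample files and diff their output. The sketch below assumes `antiword` and `pandoc` are on the `PATH`; the helper names `build_command` and `extract_with` are hypothetical. LibreOffice is left out because its headless converter writes to an output directory instead of stdout.

```python
import shutil
import subprocess

# Command templates; "{path}" is replaced by the input file.
EXTRACTOR_COMMANDS = {
    "antiword": ["antiword", "{path}"],               # legacy .doc input
    "pandoc": ["pandoc", "--to", "plain", "{path}"],  # .docx input
}


def build_command(tool, path):
    """Fill the input path into the command template for one tool."""
    return [part.format(path=path) for part in EXTRACTOR_COMMANDS[tool]]


def extract_with(tool, path):
    """Run one tool and return its stdout, or None if it is not installed."""
    command = build_command(tool, path)
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `extract_with("pandoc", "sample.docx")` and `extract_with("antiword", "sample.doc")` on equivalent documents would give two plain-text outputs to compare by eye or with a diff.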

On Mon, Nov 11, 2013 at 5:47 AM, Álvaro Justen wrote:

> For docx format there's python-docx (https://github.com/mikemaccana/python-docx).



