html5lib #3

@jackpolymath

Thanks for this code. I had a problem with "html.parser" on the results from http://www.freepatentsonline.com/search.html. When you run a search on that site, it returns a table of results, and html.parser seems to break that results table (--nth=2). On my own machine, I changed the get_soup function to use the "html5lib" parser and your code worked correctly. I'll leave it to you to change your own GitHub code. Maybe add a second parameter (e.g. --parser=html5lib), or import html5lib in tf1.py (when no parser is named, BeautifulSoup seems to pick the best one installed: lxml, then html5lib, then html.parser). Thanks again.
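
For what it's worth, here is a rough sketch of how the --parser option could look. I don't know the real signature of get_soup in tf1.py, so this version (taking a markup string plus a parser name) is an assumption, and resolve_parser and build_arg_parser are hypothetical helpers I made up for illustration. It needs the third-party bs4 package, and html5lib if you want that parser.

```python
import argparse
import importlib.util

# Parser names that BeautifulSoup accepts as its second argument.
KNOWN_PARSERS = ("html.parser", "html5lib", "lxml")


def resolve_parser(requested):
    """Hypothetical helper: fall back to the built-in parser when the
    requested third-party parser package is not installed."""
    if requested == "html.parser":
        return requested  # ships with Python, always available
    if importlib.util.find_spec(requested) is None:
        return "html.parser"
    return requested


def get_soup(markup, parser="html5lib"):
    """Assumed shape of get_soup: parse a markup string with the chosen parser."""
    from bs4 import BeautifulSoup  # third-party; imported lazily

    return BeautifulSoup(markup, resolve_parser(parser))


def build_arg_parser():
    """Hypothetical command line with the suggested --parser option."""
    ap = argparse.ArgumentParser(description="Extract a table from an HTML page")
    ap.add_argument(
        "--parser",
        choices=KNOWN_PARSERS,
        default="html5lib",
        help="parser name passed to BeautifulSoup",
    )
    return ap
```

With something like this, the script would call build_arg_parser().parse_args() and pass args.parser through to get_soup. Defaulting to html5lib is just my preference because it handled the results table on that site; keeping html.parser as the default would work too.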
