How to ignore <div> inside <td> #267

Open
jonathanyee opened this issue Apr 25, 2017 · 3 comments

@jonathanyee

I'm using the scrapeTable function and it's working fine. However, the site I'm scraping has an extra div inside a cell. How can I ignore this div: <div class="nonTablet nonDesktop"><b>12/27/2015</b></div>?

<table class="ledger accountDetail">
    <thead class="nonMobile">
        <tr>
            <th style="width: 14%;">Date</th>
            <th style="width: 44%;">Description</th>
            <th style="width: 17%;" class="right">Debits&nbsp;$ / Credits&nbsp;$</th>
            <th style="width: 25%;" class="right">Current Balance&nbsp;$</th>
        </tr>
    </thead>
    <tbody>
        <tr class="bkgd2">
            <td class="ledgerAccountDetailDesc" style="white-space: nowrap;">12/27/2015</td>
            <td class="ledgerAccountDetailDesc">
                <div class="nonTablet nonDesktop"><b>12/27/2015</b></div>
                Interest Payment	
            </td>
@Yomguithereal
Member

I'm not sure I understand what you mean, @jonathanyee. If you just want to retrieve the contained text, why not use the .text method?

@jonathanyee
Author

jonathanyee commented Apr 25, 2017

When I use the scrapeTable function, it picks up the date twice: once from the first <td> and once from the <div> inside the next <td>. So I'm wondering how I can ignore the <div>, since I don't need the date twice.

@Yomguithereal
Member

I guess at that point you can either use scrapeTable and post-process the resulting list to drop the unnecessary fields, or use scrape to do the job with less sugar but more control over the output.
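
For the post-processing route, here is a minimal sketch in plain JavaScript. It assumes the scraped result is an array of rows, each row being an array of cell strings, with the date in the first cell and the description in the second. The helper name dropRepeatedDate and the sample data are hypothetical and only illustrate the idea; they are not part of the library.

```javascript
// Hypothetical helper, not part of the scraping library.
// Assumes `rows` is an array of rows, each an array of cell strings.
function dropRepeatedDate(rows, dateIndex, descIndex) {
  return rows.map(function (row) {
    var date = row[dateIndex].trim();
    var desc = row[descIndex].trim();

    // Strip the date only when the description actually starts with it,
    // so rows without the extra <div> are left unchanged.
    if (date && desc.indexOf(date) === 0) {
      desc = desc.slice(date.length).trim();
    }

    // Copy the row so the scraped input is not mutated.
    var cleaned = row.slice();
    cleaned[dateIndex] = date;
    cleaned[descIndex] = desc;
    return cleaned;
  });
}

// Sample shaped like the row quoted in this issue (assumed cell text).
var sample = [['12/27/2015', '\n    12/27/2015\n    Interest Payment\t\n']];
var cleanedSample = dropRepeatedDate(sample, 0, 1);
```

The prefix check keeps the function safe to run over every row, including ones where the cell holds only the description.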
