How to ignore <div> inside <td> #267

jonathanyee · 2017-04-25T07:05:17Z

I'm using the scrapeTable function and it's working fine. However, the site I'm scraping has an extra div inside a cell. How can I ignore this div: <div class="nonTablet nonDesktop"><b>12/27/2015</b></div>?

<table class="ledger accountDetail">
    <thead class="nonMobile">
        <tr>
            <th style="width: 14%;">Date</th>
            <th style="width: 44%;">Description</th>
            <th style="width: 17%;" class="right">Debits&nbsp;$ / Credits&nbsp;$</th>
            <th style="width: 25%;" class="right">Current Balance&nbsp;$</th>
        </tr>
    </thead>
    <tbody>
        <tr class="bkgd2">
            <td class="ledgerAccountDetailDesc" style="white-space: nowrap;">12/27/2015</td>
            <td class="ledgerAccountDetailDesc">
                <div class="nonTablet nonDesktop"><b>12/27/2015</b></div>
                Interest Payment	
            </td>


Yomguithereal · 2017-04-25T07:19:01Z

I am not sure I understand what you mean, @jonathanyee. If you just want to retrieve the contained text, why not use the .text method?

jonathanyee · 2017-04-25T18:34:19Z

When I use the scrapeTable function, it parses the date twice: once from the <td> and once from the <div> inside the next <td>. So I'm wondering how I could ignore the <div>, since I don't need the date twice.

Yomguithereal · 2017-04-26T09:03:20Z

I guess at that point you can either use scrapeTable and post-process the resulting list to drop the unneeded fields, or use scrape to do the job, with less sugar but more control over the output.
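
The post-processing option could look roughly like the sketch below. It assumes scrapeTable returns one array of cell strings per row, and that the text of the description cell starts with the repeated date followed by the real description. The helper name `dropRepeatedDate` and the sample rows are hypothetical and are not part of artoo.

```javascript
// Hypothetical helper (not part of artoo): removes a leading copy of the
// date (cell 0) from the description (cell 1) of each scraped row.
function dropRepeatedDate(rows) {
  return rows.map(function (row) {
    // Copy the row so the original scraped data is left untouched.
    var cleaned = row.slice();

    // Rows with fewer than two cells have nothing to clean.
    if (cleaned.length < 2) {
      return cleaned;
    }

    var date = String(cleaned[0]).trim();
    var description = String(cleaned[1]).trim();

    // Strip the date only when the description actually starts with it.
    if (date && description.indexOf(date) === 0) {
      description = description.slice(date.length).trim();
    }

    cleaned[0] = date;
    cleaned[1] = description;
    return cleaned;
  });
}

// Sample rows shaped like the assumed scrapeTable output for the table above.
var scraped = [
  ['12/27/2015', '\n    12/27/2015\n    Interest Payment\t\n']
];

var cleanedRows = dropRepeatedDate(scraped);
console.log(cleanedRows);
```

In the browser this would be applied to whatever scrapeTable returns for the ledger table. If the repeated date does not always lead the cell text, using scrape with a custom extraction for that cell is likely the more robust route.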

Yomguithereal added the question label Apr 25, 2017
