This repository has been archived by the owner on Aug 23, 2020. It is now read-only.

OOM exception when trying to revalidate a node from the last global snapshot #1482

Open
GalRogozinski opened this issue Jun 5, 2019 · 2 comments · May be fixed by #1682
Labels
C-Persistence, L-Triage (Issues that need to be triaged.), T-Bug

Comments

@GalRogozinski
Contributor

Bug description

Running IRI with the --revalidate flag on a node whose DB was taken from the last global snapshot fails with a java.lang.OutOfMemoryError (Java heap space).

IRI version

1.7.0

Hardware Spec

32 GB RAM machine, -Xmx 16000m

Errors

Jun 03 12:07:26 perma-2.iota.partners IRI[15150]: 06/03 12:07:26.815 [main] INFO  c.i.i.s.r.RocksDBPersistenceProvider - Initializing Database on mainnetdb
Jun 03 12:07:27 perma-2.iota.partners IRI[15150]: 06/03 12:07:27.471 [main] INFO  c.i.i.s.r.RocksDBPersistenceProvider - RocksDB persistence provider initialized.
Jun 03 12:07:27 perma-2.iota.partners IRI[15150]: 06/03 12:07:27.471 [main] INFO  c.i.i.s.r.RocksDBPersistenceProvider - Deleting: Milestone entries
Jun 03 12:07:27 perma-2.iota.partners IRI[15150]: 06/03 12:07:27.476 [main] INFO  c.i.i.s.r.RocksDBPersistenceProvider - Amount to delete: 1243
Jun 03 12:07:27 perma-2.iota.partners IRI[15150]: 06/03 12:07:27.486 [main] INFO  c.i.i.s.r.RocksDBPersistenceProvider - Deleting: StateDiff entries
Jun 03 12:07:44 perma-2.iota.partners IRI[15150]: 06/03 12:07:44.671 [main] INFO  c.i.i.s.r.RocksDBPersistenceProvider - Amount to delete: 12
Jun 03 12:07:44 perma-2.iota.partners IRI[15150]: 06/03 12:07:44.672 [main] INFO  c.i.i.s.r.RocksDBPersistenceProvider - Deleting: Transaction metadata
Jun 04 15:16:09 perma-2.iota.partners IRI[15150]: Exception in thread "main" Exception in thread "iothread-2" java.lang.OutOfMemoryError: Java heap space
Jun 04 15:16:09 perma-2.iota.partners IRI[15150]:         at org.rocksdb.RocksIterator.key0(Native Method)
Jun 04 15:16:09 perma-2.iota.partners IRI[15150]:         at org.rocksdb.RocksIterator.key(RocksIterator.java:37)
Jun 04 15:16:09 perma-2.iota.partners IRI[15150]:         at com.iota.iri.storage.rocksDB.RocksDBPersistenceProvider.flushHandle(RocksDBPersistenceProvider.java:350)
Jun 04 15:16:09 perma-2.iota.partners IRI[15150]:         at com.iota.iri.storage.rocksDB.RocksDBPersistenceProvider.clearMetadata(RocksDBPersistenceProvider.java:329)
@jakubcech jakubcech added the L-Triage Issues that need to be triaged. label Jun 5, 2019
@jakubcech
Contributor

Mentioning #1391 here so that the two issues are linked.

@GalRogozinski
Contributor Author

It probably happens because of how RocksDBPersistenceProvider#flushHandle is implemented:

private void flushHandle(ColumnFamilyHandle handle) throws RocksDBException {
    List<byte[]> itemsToDelete = new ArrayList<>();
    try (RocksIterator iterator = db.newIterator(handle)) {
        for (iterator.seekToLast(); iterator.isValid(); iterator.prev()) {
            itemsToDelete.add(iterator.key());
        }
    }
    if (!itemsToDelete.isEmpty()) {
        log.info("Amount to delete: " + itemsToDelete.size());
    }
    int counter = 0;
    for (byte[] itemToDelete : itemsToDelete) {
        if (++counter % 10000 == 0) {
            log.info("Deleted: {}", counter);
        }
        db.delete(handle, itemToDelete);
    }
}

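As we can see, itemsToDelete grows needlessly large: every key in the column family is collected on the heap before a single delete is issued. It is also odd that we delete the keys one by one.
I suggest that once itemsToDelete reaches 10000 entries (or some other tuned threshold), we delete those keys in one batch operation and then clear itemsToDelete before continuing.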
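
To make the suggestion concrete, here is a minimal sketch of the batching logic on its own, without any RocksDB dependency. The class BatchedDelete, the method deleteInBatches, and the deleteBatch callback are hypothetical names chosen for illustration and are not part of IRI. The point is only that the pending list never holds more than batchSize keys at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Illustrative helper (hypothetical, not IRI code): deletes keys in bounded batches. */
final class BatchedDelete {

    private BatchedDelete() {
    }

    /**
     * Drains {@code keys} and hands them to {@code deleteBatch} in chunks of at most
     * {@code batchSize}. The list passed to the callback is reused between calls,
     * so the callback must not keep a reference to it.
     *
     * @return the total number of keys handed to the callback
     */
    static <K> long deleteInBatches(Iterator<K> keys, int batchSize, Consumer<List<K>> deleteBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<K> pending = new ArrayList<>(batchSize);
        long deleted = 0;
        while (keys.hasNext()) {
            pending.add(keys.next());
            if (pending.size() >= batchSize) {
                // Flush a full batch, then release the keys so memory stays bounded.
                int flushed = pending.size();
                deleteBatch.accept(pending);
                deleted += flushed;
                pending.clear();
            }
        }
        if (!pending.isEmpty()) {
            // Flush the final, partially filled batch.
            int flushed = pending.size();
            deleteBatch.accept(pending);
            deleted += flushed;
            pending.clear();
        }
        return deleted;
    }
}
```

In flushHandle, the keys would come from the RocksIterator, and the callback could collect each batch into a RocksJava WriteBatch and apply it with a single db.write(...) call. That wiring is an assumption on my side: I have not tested it against the RocksJava version IRI uses, and the exact method names should be checked there.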
