XMR, LTC, BCH, DOGE Mempool Archive

See this blog post for more details. All files are comma-separated value (CSV) format.

Monero file link

Compressed file size: 21MB

Uncompressed file size: 66MB

Number of columns: 10

Number of rows: 439,766

Litecoin file link

Compressed file size: 132MB

Uncompressed file size: 362MB

Number of columns: 8

Number of rows: 2,875,267

Bitcoin Cash file link

Compressed file size: 16MB

Uncompressed file size: 43MB

Number of columns: 8

Number of rows: 348,779

Dogecoin file link

Compressed file size: 29MB

Uncompressed file size: 24MB

Number of columns: 8

Number of rows: 612,591

The data was collected from 2022-12-21 18:00:00 UTC to 2023-01-18 17:59:59 UTC. Each row in the dataset contains data on a single transaction, except when a block contained no non-coinbase transactions. In that case, a single row is included to represent the empty block.

This script gathered the Monero data. This script processed the Monero data.

This script gathered the LTC, BCH, and DOGE data. This script processed the LTC, BCH, and DOGE data.

The is_p2pool variable for Monero was built by this script. DataHoarder wrote this script to collect data from centralized mining pools for the Pool variable.

‼️ WARNING: This is not a good dataset for counting the number of blocks each pool mined. Not all pools are included. For the pools that are included, some of them have missing data for some time periods due to API limitations (this is known to affect 2miners, for example).

“canon” is the median time value across all 5 nodes for Monero and across the 2 nodes for LTC, BCH, and DOGE. The canon transaction weight and fee are the same recorded values on all nodes. Times are given in seconds since the beginning of the Unix epoch (1970-01-01T00:00:00Z).
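The "canon" definition above can be illustrated with a minimal Python sketch. It is not the processing script linked above: the function name `canon_time` and the sample timestamps are hypothetical. It takes the median of per-node receive times and converts the result from Unix seconds to a UTC datetime.

```python
from datetime import datetime, timezone
from statistics import median


def canon_time(node_times):
    """Return the median of the receive times reported by the nodes."""
    return median(node_times)


# Hypothetical receive times (Unix seconds) reported by five Monero nodes.
monero_node_times = [1671645601, 1671645600, 1671645603, 1671645600, 1671645602]
canon = canon_time(monero_node_times)

# With two nodes (LTC, BCH, DOGE), the median is the midpoint of the two values.
two_node_canon = canon_time([1671645600, 1671645601])

# Convert Unix seconds to a timezone-aware UTC datetime.
canon_utc = datetime.fromtimestamp(canon, tz=timezone.utc)
print(canon, two_node_canon, canon_utc.isoformat())
```

Because the two-node median is a midpoint, canon times for LTC, BCH, and DOGE can be fractional even when each node reports whole seconds.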

Column descriptions:

Monero

id_hash                 : chr  Transaction ID hash
canon.receive_time      : num  receive_time field in the get_transaction_pool RPC to monerod
canon.weight            : int  weight field in the get_transaction_pool RPC to monerod
canon.fee               : num  fee field in the get_transaction_pool RPC to monerod
block_height            : int  Height of block that id_hash transaction was confirmed in
canon.block_receive_time: num  Time that node received block, based on polling every second
is_p2pool               : logi TRUE if the block was mined by P2Pool, FALSE otherwise
Pool                    : chr  Name of the mining pool that mined the block
block_num_txes          : num  num_txes field from get_block RPC to monerod
block_reward            : num  reward field from get_block RPC to monerod

Litecoin, Bitcoin Cash, and Dogecoin

id_hash                 : chr  Transaction ID hash
canon.receive_time      : num  time field in the getrawmempool RPC to node daemon
canon.weight            : int  size (vsize for LTC) field in the getrawmempool RPC to node daemon
canon.fee               : num  fee field in the getrawmempool RPC to node daemon
block_height            : int  Height of block that id_hash transaction was confirmed in
canon.block_receive_time: num  Time that node received block, based on polling every second
block_num_txes          : num  Length of tx field of getblock RPC, excluding coinbase (and MWEB for LTC)
block_reward            : num  Value of coinbase transaction output from getblock RPC to node daemon
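As one way to use these columns, the sketch below computes each transaction's confirmation delay as canon.block_receive_time minus canon.receive_time. The sample rows are made up. The sketch also assumes that an empty-block row stores `NA` (or an empty string) in the transaction fields, which this document does not specify, so check the real files before relying on that filter.

```python
import csv
import io

# Made-up rows in the LTC/BCH/DOGE column layout; all values are illustrative.
SAMPLE = """id_hash,canon.receive_time,canon.weight,canon.fee,block_height,canon.block_receive_time,block_num_txes,block_reward
aaaa,1671645600,250,0.000025,100,1671645720,2,12.5
bbbb,1671645660,500,0.0001,100,1671645720,2,12.5
NA,NA,NA,NA,101,1671645800,0,12.5
"""


def confirmation_delays(csv_file):
    """Map each transaction ID to its delay, in seconds, between mempool
    arrival and arrival of the block that confirmed it."""
    delays = {}
    for row in csv.DictReader(csv_file):
        # Assumption: empty-block rows have no transaction ID.
        if row["id_hash"] in ("", "NA"):
            continue
        received = float(row["canon.receive_time"])
        confirmed = float(row["canon.block_receive_time"])
        delays[row["id_hash"]] = confirmed - received
    return delays


delays = confirmation_delays(io.StringIO(SAMPLE))
print(delays)
```

To run this against a real file, pass an open file handle instead of the `io.StringIO` sample.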

Pre-fork BTC/BCH Spending Analysis

See this blog post for more details.

spent_status_by_day.csv

File link

Number of columns: 9

Number of rows: 1704

Column descriptions:

block_time.date                : Date, format: "YYYY-MM-DD"
value.btc.unspent.bch.unspent  : num  
outputs.btc.unspent.bch.unspent: int  
value.btc.spent.bch.unspent    : num  
outputs.btc.spent.bch.unspent  : int  
value.btc.unspent.bch.spent    : num  
outputs.btc.unspent.bch.spent  : int  
value.btc.spent.bch.spent      : num  
outputs.btc.spent.bch.spent    : int  
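A small sketch of how one row of this file might be summarized is given below. It assumes that the four value columns on a row partition the same set of pre-fork outputs, which this document does not state explicitly. The sample row, including its date, is invented for illustration.

```python
VALUE_COLUMNS = [
    "value.btc.unspent.bch.unspent",
    "value.btc.spent.bch.unspent",
    "value.btc.unspent.bch.spent",
    "value.btc.spent.bch.spent",
]


def value_shares(row):
    """Return each spent-status category's share of the row's total value."""
    total = sum(float(row[col]) for col in VALUE_COLUMNS)
    if total == 0:
        # Avoid division by zero on a row with no value in any category.
        return {col: 0.0 for col in VALUE_COLUMNS}
    return {col: float(row[col]) / total for col in VALUE_COLUMNS}


# Invented row; csv.DictReader would yield strings like these.
sample_row = {
    "block_time.date": "2017-08-01",
    "value.btc.unspent.bch.unspent": "2",
    "value.btc.spent.bch.unspent": "1",
    "value.btc.unspent.bch.spent": "1",
    "value.btc.spent.bch.spent": "0",
}
shares = value_shares(sample_row)
print(shares)
```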

state_trans_by_day.csv

File link

Number of columns: 11

Number of rows: 1704

Column descriptions:

block_time.date : Date, format: "YYYY-MM-DD"
value.ff.to.tf  : num  
outputs.ff.to.tf: int  
value.ff.to.ft  : num  
outputs.ff.to.ft: int  
value.ff.to.tt  : num  
outputs.ff.to.tt: int  
value.ft.to.tt  : num  
outputs.ft.to.tt: num  
value.tf.to.tt  : num  
outputs.tf.to.tt: num  

pre-fork-BTC-BCH-spent_status.zip

File link

Compressed file size: 385MB

Uncompressed file size: 2,713MB

Number of columns: 6

Number of rows: 53,658,348

Column descriptions:

btc.spent.block_height: int  Block height that the output was spent on the BTC blockchain, if it was spent
bch.spent.block_height: int  Block height that the output was spent on the BCH blockchain, if it was spent
destination_index     : int  Unique integer index of the output that was created by the R script
value                 : num  Value of the output, in bitcoin units
bch.block_time        : POSIXct, format: "YYYY-MM-DD HH:MM:SS" Block time of the BCH block height
btc.block_time        : POSIXct, format: "YYYY-MM-DD HH:MM:SS" Block time of the BTC block height
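Since the two block height columns are filled only when an output was spent, each row can be classified by its spent status on both chains. The sketch below assumes an unspent output appears as `NA` or an empty field in the CSV, which is an assumption about the file encoding rather than something stated here. The helper names and sample rows are hypothetical.

```python
def parse_height(field):
    """Return the block height as an int, or None when the field is missing."""
    return None if field in ("", "NA") else int(field)


def spent_status(row):
    """Return (spent_on_btc, spent_on_bch) for one output row."""
    btc_height = parse_height(row["btc.spent.block_height"])
    bch_height = parse_height(row["bch.spent.block_height"])
    return (btc_height is not None, bch_height is not None)


# Invented rows; csv.DictReader would yield string fields like these.
rows = [
    {"btc.spent.block_height": "480000", "bch.spent.block_height": "NA"},
    {"btc.spent.block_height": "NA", "bch.spent.block_height": "NA"},
    {"btc.spent.block_height": "481000", "bch.spent.block_height": "479500"},
]
statuses = [spent_status(r) for r in rows]
print(statuses)
```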

CashFusion Descendants Analysis

File link

Compressed file size: 749MB

Uncompressed file size: 2,804MB

Number of columns: 5

Number of rows: 29,954,761

A CSV file of the BCH unspent transaction outputs (UTXOs) created between block heights 646085 and 719602, indicating which outputs are CashFusion descendants and which are not.

See this blog post for more details.

Column descriptions:

txid_position: An identifier of an unspent output, in the form of TXID-position. The position of outputs is indexed from one, not from zero.
tx_graph_index: An integer index for the unspent output that was used in the transaction graph analysis.
value: The value, in BCH, of the output. Zero-valued outputs are included in the dataset.
is_cashfusion_descendant: Takes value of 1 if output is a descendant of a CashFusion transaction and 0 otherwise.
is_coinbase: Takes value of 1 if output is the result of a coinbase transaction and 0 otherwise.
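Because txid_position indexes outputs from one, while most node RPCs and block explorers index outputs from zero, a conversion step is useful when cross-referencing. The sketch below is a minimal example. The function name `split_txid_position` and the sample identifier are hypothetical.

```python
def split_txid_position(txid_position):
    """Split a 'TXID-position' string into (txid, zero_based_index)."""
    # Split on the last hyphen so the position is always the final component.
    txid, position = txid_position.rsplit("-", 1)
    one_based = int(position)
    if one_based < 1:
        raise ValueError("positions in this dataset are indexed from one")
    return txid, one_based - 1


# Invented 64-character hex TXID followed by a one-based output position.
sample_id = "ab" * 32 + "-1"
txid, vout = split_txid_position(sample_id)
print(txid, vout)
```

Here the first output of a transaction, stored as position 1 in the dataset, maps to index 0.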