rieMiner - Solo + pooled Riecoin mining

Pttn
Posts: 131
Joined: 24 Aug 2018, 13:37

Re: rieMiner - Solo + pooled Riecoin mining

Post by Pttn » 19 Nov 2018, 12:01

As nobody seems interested in completely fixing the CPU Underuse Bug, as many people have confirmed that rieMiner works properly, and as gatra has not cared about Riecoin for a long time, there is no longer any reason to wait for the official 0.16.3. I plan to release the first stable rieMiner version this weekend or the next one, even though the only practical change for miners since Beta 3 will be a significantly faster initialization for bigger sieves. There will, however, probably be a significant code refactoring to make it even more robust for future development.

For now, I am bumping the version status to Release Candidate.
rieMiner - Riecoin solo + pooled miner
Personal Riecoin page (links, download,...)
freebitco.in - earn up to $200 in BTC each hour!

Rockhawk
Posts: 48
Joined: 29 Oct 2018, 21:12

Re: rieMiner - Solo + pooled Riecoin mining

Post by Rockhawk » 20 Nov 2018, 09:56

While thinking about the possibility of reusing the mining code for prime searches outside of Riecoin, I thought of a different way to parallelise the sieving. This significantly reduces the amount of synchronization required between threads, and means a single instance can run efficiently across more threads.

This change does come with a memory cost per sieve thread, but overall memory usage for one 16 thread instance will still be less than it is now with two 8 thread instances. I've also added configuration so you can directly control the number of sieve workers from the config file, which allows you to trade off keeping the verification threads busy against the increased memory usage per worker.
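
To illustrate the idea in miniature, here is a toy sketch in Python. It is not rieMiner's actual code: the tiny primorial (2310), the handful of sieving primes, the list of offsets and the names `sieve_worker` and `run_workers` are all assumptions made up for this example. Each worker owns a private sieve array for its own offset, so the workers share nothing and need no locking, and the price is one array per worker.

```python
from concurrent.futures import ThreadPoolExecutor

PATTERN = (0, 4, 6, 10, 12, 16)       # prime sextuplet pattern
PRIMORIAL = 2 * 3 * 5 * 7 * 11        # 2310, toy size (assumption)
SIEVE_PRIMES = (13, 17, 19, 23)       # toy sieving primes (assumption)
OFFSETS = (97, 937, 1147, 1357)       # offsets admissible modulo 2310


def sieve_worker(offset, size):
    """Sieve candidates PRIMORIAL * f + offset for f in range(size).

    The sieve array is private to this call: no locks, but one array per worker.
    """
    alive = [True] * size
    for p in SIEVE_PRIMES:
        inverse = pow(PRIMORIAL, -1, p)  # modular inverse, Python 3.8+
        for d in PATTERN:
            # Smallest f with (PRIMORIAL * f + offset + d) % p == 0.
            first = (-(offset + d) * inverse) % p
            for f in range(first, size, p):
                alive[f] = False
    return [f for f in range(size) if alive[f]]


def run_workers(size):
    """Run one sieve worker per offset and collect the surviving factors."""
    with ThreadPoolExecutor(max_workers=len(OFFSETS)) as pool:
        results = pool.map(lambda off: sieve_worker(off, size), OFFSETS)
        return dict(zip(OFFSETS, results))


if __name__ == "__main__":
    for offset, survivors in run_workers(40).items():
        print(offset, survivors)
```

In CPython these threads do not really run in parallel; the point of the sketch is only that no state is shared between workers.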

Pttn, if you are interested, please take a look at my feature-better-sieve branch. I won't merge in the refactoring changes and do a PR unless you report good results with it.

Post by Pttn » 20 Nov 2018, 10:50

It was very clever to make the threads work with different offsets! The CPU Underuse Bug completely disappeared with 16 threads at the current difficulty! Quick testing seems to show no performance loss (there even seems to be an increase of a few %), but I will do a more rigorous test later to confirm.

PasteBin of the quick test. For comparison, I got ~314 pps with the two 8-thread instances without your code (measured just before stopping them) vs ~327 pps with your code (2700X @ 3.7 GHz), but I need to test in a more stable situation (fresh boot).
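
For reference, the relative gain implied by those two quick-test figures can be worked out as below. This is plain arithmetic on the approximate numbers quoted here, so it inherits their uncertainty; the helper name `relative_gain` is just for this example.

```python
def relative_gain(before_pps, after_pps):
    """Return the relative change from before_pps to after_pps, in percent."""
    return (after_pps - before_pps) / before_pps * 100.0


# ~314 pps (two 8-thread instances, old code) vs ~327 pps (new code)
gain = relative_gain(314.0, 327.0)
print(f"{gain:.1f} %")  # 4.1 %
```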

Do you think we could get rid of the CPU Underuse Bug completely simply by adding more offsets? I wonder how your solution will behave with 32 threads and more, but for me, with 16 threads, it seems to work pretty well at the current difficulties!

Post by Rockhawk » 20 Nov 2018, 11:44

Thanks :)

There is a balance between adding more offsets on one side and, on the other, both memory usage and the time before verifications can start after beginning work on a block, because remainders for each offset have to be calculated before sieving can start (this is cheaper than calculating remainders for completely different blocks, but still has some cost). For low difficulties or a higher sieve max, though, it is good to add more sieve workers than the default configuration, which is why I made it configurable. I guess you could store the remainder results and then generate new sieves using more offsets based on the stored results when the sieve finishes, but OTOH it's good to refresh the block occasionally.

For all practical cases I think this solves the CPU underuse issue, except there's a blip in CPU usage while the sieves get going after a block height change, and if maxWorkOut (a parameter in Miner::process()) is too low there's a blip when switching blocks too. However, if maxWorkOut is too high then you increase the number of candidates thrown away when the block height changes, so mining is less efficient (though benchmarks are unaffected). I'd like to change it so that this parameter is adjusted automatically rather than being hard coded.

Another change I'd like to make is to look for primorial offsets closer together, as there is a slight performance loss for larger offset differences. Currently I'm using the first prime in real prime sextuplets, but it is actually sufficient to use the first number in constellations for which none of the primes in the primorial divide any of the candidates, so it's likely a set of offsets that are much closer together could be found.
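
A minimal sketch of that condition, assuming the sextuplet pattern (n, n+4, n+6, n+10, n+12, n+16) and a toy primorial much smaller than a real miner would use. The function name `admissible_offsets` is invented for this example. It lists every offset below the primorial for which no prime of the primorial divides any of the six candidates, whether or not the six numbers are actually prime.

```python
from math import prod

PATTERN = (0, 4, 6, 10, 12, 16)  # prime sextuplet pattern


def admissible_offsets(primes):
    """Offsets o in [0, primorial) where no prime in `primes` divides any o + d."""
    primorial = prod(primes)
    return [
        o
        for o in range(primorial)
        if all((o + d) % p != 0 for p in primes for d in PATTERN)
    ]


if __name__ == "__main__":
    print(admissible_offsets([2, 3, 5, 7]))      # [97]
    print(admissible_offsets([2, 3, 5, 7, 11]))  # [97, 937, 1147, 1357, 2197]
```

Of the five offsets found modulo 2310, only 97 starts a real prime sextuplet; 2197 = 13^3 is not even prime, yet it still qualifies under the weaker condition.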

I'd also like to add configuration for sieveBits, as the best value for this depends on the size of your L3 cache and the number of prime workers you're using.

I'll look at getting those things done plus the merge of your refactoring changes over the next day or two.

Post by Rockhawk » 20 Nov 2018, 12:11

Actually, the idea of storing the remainder results to be reused on further sieve iterations has given me an idea of how to better split the work between the mod and sieve jobs to avoid the use of the bucket lock, which currently costs some performance when starting processing on a new block.

As long as enough offsets were configured, this would mean you could keep processing one block for as long as you liked, with slightly better performance than getting an updated block. I guess we could ask the client if the total transaction fee had increased by more than some threshold, and if not we could keep processing the same data?
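
The decision rule floated here could look roughly like the sketch below. Everything in it is hypothetical: `WorkSnapshot`, `should_refresh` and the fee threshold are names invented for illustration, not part of rieMiner or of any client API, and how the height and fee total would actually be obtained is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkSnapshot:
    """Hypothetical summary of a block template; not a rieMiner type."""
    height: int
    total_fees: int  # sum of transaction fees, in the coin's smallest unit


def should_refresh(current, latest, fee_threshold):
    """Decide whether to drop the current work in favour of `latest`."""
    if latest.height != current.height:
        return True  # a new block was found: the old work is stale
    # Same height: only switch if the extra fees are worth the restart cost.
    return latest.total_fees - current.total_fees > fee_threshold


if __name__ == "__main__":
    current = WorkSnapshot(height=990876, total_fees=10_000)
    print(should_refresh(current, WorkSnapshot(990876, 12_000), 5_000))  # False
    print(should_refresh(current, WorkSnapshot(990876, 20_000), 5_000))  # True
    print(should_refresh(current, WorkSnapshot(990877, 0), 5_000))       # True
```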

Post by Pttn » 20 Nov 2018, 13:42

Rockhawk wrote:
20 Nov 2018, 12:11
Actually, the idea of storing the remainder results to be reused on further sieve iterations has given me an idea of how to better split the work between the mod and sieve jobs to avoid the use of the bucket lock, which currently costs some performance when starting processing on a new block.

As long as enough offsets were configured, this would mean you could keep processing one block for as long as you liked, with slightly better performance than getting an updated block. I guess we could ask the client if the total transaction fee had increased by more than some threshold, and if not we could keep processing the same data?

With the GetBlockTemplate implementation, you can check whether the block contains new transactions with the WorkData's txCount attribute, but there is no trivial way to do this in Stratum other than comparing some hash in the Block Header (or you could check the JobId, though I am not sure when it changes). To check the fees outside the Client, you need to decode the Coinbase in the Block Header.

And thinking about the future: if Riecoin starts to be used a lot, there will be many transactions anyway, so updating the work that regularly is already future-proof.

Post by Pttn » 20 Nov 2018, 20:01

Rockhawk wrote:
20 Nov 2018, 11:44
Thanks :)

There is a balance between adding more offsets on one side and, on the other, both memory usage and the time before verifications can start after beginning work on a block, because remainders for each offset have to be calculated before sieving can start (this is cheaper than calculating remainders for completely different blocks, but still has some cost). For low difficulties or a higher sieve max, though, it is good to add more sieve workers than the default configuration, which is why I made it configurable. I guess you could store the remainder results and then generate new sieves using more offsets based on the stored results when the sieve finishes, but OTOH it's good to refresh the block occasionally.

For all practical cases I think this solves the CPU underuse issue, except there's a blip in CPU usage while the sieves get going after a block height change, and if maxWorkOut (a parameter in Miner::process()) is too low there's a blip when switching blocks too. However, if maxWorkOut is too high then you increase the number of candidates thrown away when the block height changes, so mining is less efficient (though benchmarks are unaffected). I'd like to change it so that this parameter is adjusted automatically rather than being hard coded.

Another change I'd like to make is to look for primorial offsets closer together, as there is a slight performance loss for larger offset differences. Currently I'm using the first prime in real prime sextuplets, but it is actually sufficient to use the first number in constellations for which none of the primes in the primorial divide any of the candidates, so it's likely a set of offsets that are much closer together could be found.

I'd also like to add configuration for sieveBits, as the best value for this depends on the size of your L3 cache and the number of prime workers you're using.

I'll look at getting those things done plus the merge of your refactoring changes over the next day or two.

So I tested more formally and I can confirm the slight performance increase of a few % with your code, though for some reason the metrics seem to converge more slowly now. And even with the CPU usage drop when there is a new block, the performance remains noticeably better, so I do not really care about that drop.

With this solution, I consider the CPU Underuse Bug solved in practice. In theory, it is still a problem: at Difficulty 304, it is still present for any sieve over 2^20 with 16 threads, and I wonder whether it will work properly with 32 threads and more (I have no way to test these cases). Maybe the threading logic will still need to be rewritten from scratch someday. But for now, it is fixed enough for me and I can finally close the issue.

Your proposals seem OK and I will be glad to merge everything. Thanks a lot.

Post by Rockhawk » 20 Nov 2018, 20:19

Rockhawk wrote:
20 Nov 2018, 12:11
Actually, the idea of storing the remainder results to be reused on further sieve iterations has given me an idea of how to better split the work between the mod and sieve jobs to avoid the use of the bucket lock, which currently costs some performance when starting processing on a new block.

I gave this a try but it didn't actually work out very well in practice.

Pttn wrote:
20 Nov 2018, 20:01
Your proposals seem OK and I will be glad to merge everything. Thanks a lot.

Great, I'll look at finishing those things off next.

Post by Pttn » 21 Nov 2018, 00:44

Rockhawk wrote:
10 Nov 2018, 15:49
Record for the quickest finding a block after the miner starts? I was just doing some testing and:
[0000:00:00] Started mining at block 984921 difficulty 1414
[0000:00:30] (1-3t/s) = (111.8 3.88 0.131) ; (2-6t) = (118 4 0 0 0)
Block timing: 10396219, 24270330, 253198398
[0000:01:00] (1-3t/s) = (113.1 4.42 0.132) ; (2-6t) = (267 8 0 0 0)
Block timing: 10053809, 23123397, 256614267
[0000:01:30] (1-3t/s) = (115.3 4.33 0.133) ; (2-6t) = (392 12 0 0 0)
[0000:01:33] 6-tuple found, this is a block!

Code:

[0000:00:00] Started mining at block 990876, difficulty 1408
Block timing: 37243181, 46338967, 289813610  Tests out: 1296, 0
[0000:00:31] 4-tuple found
Block timing: 34781288, 90908283, 404156820  Tests out: 0, 1536
[0000:01:00] (1-3t/s) = (318.3 12.56 0.545) ; (2-6t) = (761 33 1 0 0) | 9.12 h
[0000:01:12] 6-tuple found, this is a block!
Sent: {"method": "submitblock", "params": ["000000201e8469ec5278c76f329bed5b426b70aa7dac3b5a9a108564322dedb3e5b24bf0e3a9d1d3378745dffbefc06a2ae717a948179818499ed23dcfefaaad31ab0dd20080050285a7f45b00000000b70133b6b7b60c07f7d6416cecb1dbbf5312713c558b54de3ab59430bba77e0d0101000000010000000000000000000000000000000000000000000000000000000000000000ffffffff10039d1e0f7269654d696e657239c0e351ffffffff0100f90295000000001976a914a0524c39408c405357010881226da2c6c467e72488ac00000000"], "id": 0}                                                                                                                                                                                                                       
Submission accepted :D !
[0000:01:12] Blockheight = 990877, average 72.8 s, difficulty = 1408
Block timing: 35633056, 31125641, 193034233  Tests out: 0, 0
Block timing: 34942651, 50875030, 340077887  Tests out: 0, 943
[0000:01:50] 4-tuple found

Finding a block in less than 2 minutes happens sometimes :D ! It is not the first time for me!
At the current difficulty, a 4-tuple has about a 1/625 chance of being a 6-tuple (assuming a ratio of ~25 between successive tuple lengths, so 25^2 = 625), so if you find a 4-tuple every minute, you have roughly this probability of finding a block in the first minute...
If only I had such luck for the Superblock...
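
The estimate can be checked against the rates printed in the log above, as a rough sketch. It assumes, as the estimate itself does, that each extra tuple length costs the same factor; the function names are made up for this example.

```python
# Rates from the log line "(1-3t/s) = (318.3 12.56 0.545)".
RATE_1T, RATE_2T, RATE_3T = 318.3, 12.56, 0.545

ratio_1_to_2 = RATE_1T / RATE_2T  # about 25.3
ratio_2_to_3 = RATE_2T / RATE_3T  # about 23.0


def chance_6_tuple_from_4_tuple(ratio):
    """Chance that a 4-tuple extends to a 6-tuple: two more steps of 1/ratio."""
    return 1.0 / ratio ** 2


def chance_block_within(four_tuples, ratio):
    """Chance that at least one of `four_tuples` 4-tuples is a 6-tuple."""
    p = chance_6_tuple_from_4_tuple(ratio)
    return 1.0 - (1.0 - p) ** four_tuples


if __name__ == "__main__":
    print(f"ratios: {ratio_1_to_2:.1f}, {ratio_2_to_3:.1f}")
    print(chance_6_tuple_from_4_tuple(25))  # 0.0016, i.e. 1/625
    print(chance_block_within(2, 25))       # two 4-tuples, e.g. two minutes
```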

Post by Rockhawk » 21 Nov 2018, 23:57

On the offsets, I realised there was a small optimization if the difference between the offsets was a constant, so I've found 2 groups of 4 offsets with constant difference that work. We could do a bigger search to get bigger groups but as 4 sieve workers is probably as many as anyone needs in practice for now this is likely good enough.
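
As a toy illustration of searching for such groups (not the search actually used for the branch), the sketch below lists offsets that are admissible for a small primorial and then looks for arithmetic progressions among them. The sextuplet pattern, the small primorials and both function names are assumptions for this example.

```python
from math import prod

PATTERN = (0, 4, 6, 10, 12, 16)  # prime sextuplet pattern


def admissible_offsets(primes):
    """Offsets o in [0, primorial) where no prime in `primes` divides any o + d."""
    primorial = prod(primes)
    return [
        o
        for o in range(primorial)
        if all((o + d) % p != 0 for p in primes for d in PATTERN)
    ]


def constant_difference_groups(offsets, length):
    """All groups of `length` offsets that form an arithmetic progression."""
    available = set(offsets)
    ordered = sorted(available)
    groups = []
    for i, start in enumerate(ordered):
        for second in ordered[i + 1:]:
            step = second - start
            group = tuple(start + k * step for k in range(length))
            if all(o in available for o in group):
                groups.append(group)
    return groups


if __name__ == "__main__":
    small = admissible_offsets([2, 3, 5, 7, 11])
    print(constant_difference_groups(small, 3))
    # [(97, 1147, 2197), (937, 1147, 1357)]
    larger = admissible_offsets([2, 3, 5, 7, 11, 13])
    print(constant_difference_groups(larger, 4))
```

A real search would work modulo a much larger primorial, where brute-force enumeration like this is no longer feasible.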

Overall, I think this branch is about ready now. I'll try and get the merge done tomorrow so I can submit a pull request.
