rieMiner - Solo + pooled Riecoin mining

czakris
Posts: 20
Joined: 16 Sep 2018, 05:03

Re: rieMiner - Solo + pooled Riecoin mining

Post by czakris » 30 Oct 2018, 19:10

Report from the last 15 hours (orphaned blocks and missing payments):
978669, 978636, 978626, 978611, 978606, 978595, 978589, 978582, 978578, 978572, 978555, 978550, 978543, 978534, 978500, (2018-10-30 11:40:28 AddToWallet) 978501, 978489, 978487, 978473, (2018-10-30 10:14:39 AddToWallet) 978474, 978469, 978465, 978456, 978452, 978446, 978438, 978419, 978412, 978405, 978401, 978398, 978376, 978372, 978368, 978364, 978364,
I had these nodes:
addnode=5.9.39.9
addnode=37.59.143.10
addnode=144.217.15.39
addnode=149.14.200.26
addnode=178.251.25.240
addnode=193.70.33.8
addnode=195.138.71.80
addnode=198.251.84.221
addnode=199.126.33.5
addnode=217.182.76.201
And I added a few more:
addnode=45.32.147.222
addnode=107.191.63.155
addnode=45.32.129.71
addnode=45.76.34.20
addnode=45.63.36.126
addnode=45.32.156.59
addnode=45.63.115.161
addnode=45.63.19.206
addnode=45.32.74.83
addnode=45.63.57.92
addnode=120.27.99.210
addnode=192.241.187.43
addnode=192.95.24.114
addnode=207.172.128.231
addnode=5.9.39.9
addnode=85.70.38.244
addnode=88.88.21.16
addnode=108.61.123.145
Now I have 2 active connections to Riecoin and will wait a few days to check whether this fixes it.
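A side note for anyone assembling a similar list: the two lists above share one entry (5.9.39.9), which is harmless but easy to miss. Below is a minimal Python sketch for merging `addnode=` lines and dropping repeats while keeping their order. The helper name `merge_addnodes` is made up for illustration, and only a handful of the addresses above are used as sample input.

```python
def merge_addnodes(*blocks):
    """Merge blocks of 'addnode=<ip>' lines, keeping the first occurrence of each address."""
    seen = set()
    merged = []
    for block in blocks:
        for line in block.splitlines():
            line = line.strip()
            if not line.startswith("addnode="):
                continue  # skip blank lines and unrelated settings
            address = line[len("addnode="):]
            if address not in seen:
                seen.add(address)
                merged.append(address)
    return merged

# Sample input: a few entries taken from the two lists in this post
old_nodes = """
addnode=5.9.39.9
addnode=37.59.143.10
"""
new_nodes = """
addnode=45.32.147.222
addnode=5.9.39.9
addnode=85.70.38.244
"""

for address in merge_addnodes(old_nodes, new_nodes):
    print("addnode=" + address)
```

The merged lines can then be pasted back into the configuration file in place of the two separate lists.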
Last edited by czakris on 30 Oct 2018, 22:09, edited 2 times in total.

czakris
Posts: 20
Joined: 16 Sep 2018, 05:03

Re: rieMiner - Solo + pooled Riecoin mining

Post by czakris » 30 Oct 2018, 22:01

[0001:51:44] 6-tuple found, this is a block!
Sent: {"method": "submitblock", "params": ["020000000a4f130f247f04d4afafbdfc9aaa
d3e9de822cbfb09cb27ee0af434619f20f699486366f74fe7bf34e9c1491062f89a4436de8ed4be5
eaf1ba05bb58e2069c01007c0502bfcbd85b0000000097b9c466e2fea97c19883465e9635ab0a97c
52712466d5834c2fce5ce8f3f4700101000000010000000000000000000000000000000000000000
000000000000000000000000ffffffff10033aef0e7269654d696e6572a563f542ffffffff0100f9
0295000000001976a9147f400eb11873957573b22166494417ef18eff48c88ac00000000"], "id"
: 0}
Submission accepted :D !
[0001:51:44] Blockheight = 978746, average 124.1 s, difficulty = 1404
You were right, Pttn: now I have 7 connections. Not many, but at least I am getting some rewards for blocks.
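For readers unfamiliar with the "Sent:" line in the log above: it is a JSON-RPC request to the wallet's `submitblock` method, with the serialized block passed as a single hex string. Here is a minimal Python sketch of building such a request body. `build_submitblock_request` is a hypothetical helper, the hex string is a short placeholder rather than a real block, and actually sending the request (URL, credentials) is left out because it depends on the local wallet configuration.

```python
import json

def build_submitblock_request(block_hex, request_id=0):
    """Build a JSON-RPC body shaped like the 'Sent:' line in the miner log."""
    # Raises ValueError if the string is not valid hex made of whole bytes
    bytes.fromhex(block_hex)
    return json.dumps({"method": "submitblock", "params": [block_hex], "id": request_id})

# Placeholder payload, NOT a real block
body = build_submitblock_request("02000000" + "00" * 8)
print(body)
```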

Rockhawk
Posts: 48
Joined: 29 Oct 2018, 21:12

Re: rieMiner - Solo + pooled Riecoin mining

Post by Rockhawk » 31 Oct 2018, 22:13

Pttn wrote:
30 Oct 2018, 02:53
Welcome here Rockhawk :D !

Interesting. I am looking forward to seeing these improvements in a rieMiner fork and comparing the performance. There is a Benchmark Mode in rieMiner which will make such testing easy. If you solved the CPU underuse problem, and even sped up the mining significantly, I will be glad to donate a few thousand RIC if your code is released under the MIT licence and if I like it. However, it is a shame that you need to mod GMP.

Would the speed up be just a few %, or more like 20-30%?
The benchmark mode is great - it really helps to have a stable environment to test in!

I have ported over the first chunk of my optimizations. This is everything that doesn't start bringing in customized parts of GMP and x64 assembler.

The improvements are to the sieving and modulo calculations, so give more benefit with a higher prime limit. You get around a 5% speed up with the default prime limit of 2^30, and nearly 10% with a prime limit of 3*2^30. Note that the current limit is 2^32 (the configuration allows higher but there's a multiply that overflows, I've added an assert against it where the problem is).
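To illustrate why the sieving work grows with the prime limit, here is a toy Python sketch of sieving for 6-tuple candidates. It is not rieMiner's code: the function names, the tiny search range and the small prime limit are all made up for illustration. The offsets (0, 4, 6, 10, 12, 16) are the sextuplet pattern, e.g. 7, 11, 13, 17, 19, 23. For every sieving prime p, one modulo per offset locates the first n where n + offset is divisible by p, and every p-th candidate after that is crossed out, so a higher prime limit means more modulo and marking work before any primality test runs.

```python
OFFSETS = (0, 4, 6, 10, 12, 16)  # sextuplet pattern, e.g. 7, 11, 13, 17, 19, 23

def small_primes(limit):
    """Primes below limit (limit >= 2), via a basic sieve of Eratosthenes."""
    flags = [True] * limit
    flags[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            for j in range(i * i, limit, i):
                flags[j] = False
    return [i for i, is_prime in enumerate(flags) if is_prime]

def sieve_candidates(start, count, prime_limit):
    """Return n in [start, start + count) where no n + offset has a prime factor below prime_limit."""
    alive = [True] * count
    for p in small_primes(prime_limit):
        for offset in OFFSETS:
            # First index i such that (start + i + offset) is divisible by p
            first = (-(start + offset)) % p
            for i in range(first, count, p):
                alive[i] = False
    return [start + i for i in range(count) if alive[i]]

candidates = sieve_candidates(start=16000, count=100, prime_limit=128)
print(candidates)
```

In this toy range the prime limit exceeds the square root of every number involved, so the survivors are already genuine sextuplets. In real mining the numbers are far too large for that, and the survivors are only candidates that still need primality testing.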

Questions:
- There's a tension between being flexible about the tuple definition and performance. I have been "semi flexible" and sort of supported dynamic tuple offsets, but only when the differences are 2 and 4 (though it would be straightforward to add more differences if that was useful). I suspect hard coding array lengths would help the optimizer (though I haven't tested it). What's the goal of allowing different tuple definitions?
- Are we happy to drop support for x64 platforms? These changes target x64 with SSE2 or later only, but for now it would be easy to add the old code in as a fallback. However, this will get trickier as more assembler is added.

I've submitted a pull request.

Pttn
Posts: 131
Joined: 24 Aug 2018, 13:37

Re: rieMiner - Solo + pooled Riecoin mining

Post by Pttn » 01 Nov 2018, 11:56

Rockhawk wrote:
31 Oct 2018, 22:13
The benchmark mode is great - it really helps to have a stable environment to test in!

I have ported over the first chunk of my optimizations. This is everything that doesn't start bringing in customized parts of GMP and x64 assembler.

The improvements are to the sieving and modulo calculations, so give more benefit with a higher prime limit. You get around a 5% speed up with the default prime limit of 2^30, and nearly 10% with a prime limit of 3*2^30. Note that the current limit is 2^32 (the configuration allows higher but there's a multiply that overflows, I've added an assert against it where the problem is).

Questions:
- There's a tension between being flexible about the tuple definition and performance. I have been "semi flexible" and sort of supported dynamic tuple offsets, but only when the differences are 2 and 4 (though it would be straightforward to add more differences if that was useful). I suspect hard coding array lengths would help the optimizer (though I haven't tested it). What's the goal of allowing different tuple definitions?
- Are we happy to drop support for x64 platforms? These changes target x64 with SSE2 or later only, but for now it would be easy to add the old code in as a fallback. However, this will get trickier as more assembler is added.

I've submitted a pull request.
Thank you for your contribution. What CPU are you using for benchmarking?

You mean dropping x86 support (and not x64)? As nobody is going to mine on a 32-bit CPU + OS now, it is fine. I also think that it is reasonable to drop other architectures (nobody is going to mine on ARM either). Otherwise, someone can always make a fork to support older CPUs or other architectures, or just use rieMiner up to 0.9β2.3.

AVX has existed since Sandy Bridge (2011), so it has been around for a very long time; if you use it to improve the performance even more, that is Ok with me (it does not make much sense to mine on anything older than the 2600K). But if optimizations involve modding a library, I am against including them, and it would be better to keep them in your fork.

Allowing different tuple definitions can be useful if we decide to hard fork to change Riecoin's proof of work. There was some discussion about this subject, in particular because mathematically, it is impossible for a 6-tuple to be a 7-tuple or more, which is a shame. So, rieMiner should be ready if the day we choose to change the tuple size ever comes. And maybe some people might be interested in using rieMiner not for mining, but just to find big prime constellations of any size for themselves. But, as this will likely not happen for a very long time, if fixing the size can give significant improvements (more than 5%), we can do it.

Unfortunately, the CPU underuse bug is still present, although it is a bit less severe. However, it looks like I got a massive ~40% increase on my 2700X @3.4 GHz using 2 (instances) x 8 threads and a sieve of 2^32 (for actual mining at difficulty 1395)! Though, the CPU underuse was significant before (even with these 2 instances) and is negligible now.

I will do some further benchmarking and code review later. If the results are really convincing and you allow your code to be released under the MIT licence, I will merge your pull request. It is also Ok if you choose GPL, as the miner part of the code is already under this licence, but personally, I think more permissive is better, especially in something related to math research. And even if the CPU underuse bug is still present, it is already worth merging a significant performance enhancement.

Would you be able to completely fix the CPU underuse bug before I release a new 0.9β3? I really wish to get this fixed before releasing a stable version. In particular, mining with 8 threads or more on Testnet should properly use the CPU at 100% for any Sieve up to 2^32. It would also be nice if you could fix the multiplication overflow to allow a sieve higher than 2^32 (if possible).
rieMiner - Riecoin solo + pooled miner
Personal Riecoin page (links, download,...)
freebitco.in - earn up to $200 in BTC each hour!

Rockhawk
Posts: 48
Joined: 29 Oct 2018, 21:12

Re: rieMiner - Solo + pooled Riecoin mining

Post by Rockhawk » 01 Nov 2018, 14:49

Pttn wrote:
01 Nov 2018, 11:56
Thank you for your contribution. What CPU are you using for benchmarking?
Quite old - an i7-3770 (Ivy Bridge).
Pttn wrote:
01 Nov 2018, 11:56
You mean dropping x86 support (and not x64)? As nobody is going to mine on a 32-bit CPU + OS now, it is fine. I also think that it is reasonable to drop other architectures (nobody is going to mine on ARM either). Otherwise, someone can always make a fork to support older CPUs or other architectures, or just use rieMiner up to 0.9β2.3.
Sorry, I meant dropping support for anything other than x64. Think ARM64 is the only thing likely to come up, if anyone ever starts producing high performance ARM chips. But I don't think that's worth worrying about for now, so we'll say x64 only for new versions.
Pttn wrote:
01 Nov 2018, 11:56
AVX has existed since Sandy Bridge (2011), so it has been around for a very long time; if you use it to improve the performance even more, that is Ok with me (it does not make much sense to mine on anything older than the 2600K). But if optimizations involve modding a library, I am against including them, and it would be better to keep them in your fork.
The way I will do the next optimizations is to copy some code from the GMP library and then modify it, rather than patching the library itself. If you're willing to accept new GPL code then this should be fine. I'll probably look at this next week.
Pttn wrote:
01 Nov 2018, 11:56
Allowing different tuple definitions can be useful if we decide to hard fork to change Riecoin's proof of work. There was some discussion about this subject, in particular because mathematically, it is impossible for a 6-tuple to be a 7-tuple or more, which is a shame. So, rieMiner should be ready if the day we choose to change the tuple size ever comes. And maybe some people might be interested in using rieMiner not for mining, but just to find big prime constellations of any size for themselves. But, as this will likely not happen for a very long time, if fixing the size can give significant improvements (more than 5%), we can do it.
OK, I think we could support a max offset of 6 (instead of 4), which would cover all patterns up to 12-tuplets, without a significant loss of performance. We could maybe compile that out in release mode until it is required. As an aside, it would be great to see a wider variety of k-tuplets being used - maybe we could break some records!
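As a small illustration of what "max offset" means here (reading it as the largest difference between consecutive members of a pattern), the Python sketch below lists those differences for two patterns. The helper name `gaps` is invented for this example. The first pattern is the 6-tuple used for mining; the second is one 7-tuplet pattern, chosen only as an example.

```python
def gaps(offsets):
    """Differences between consecutive offsets of a constellation pattern."""
    return [b - a for a, b in zip(offsets, offsets[1:])]

SEXTUPLET = (0, 4, 6, 10, 12, 16)      # e.g. 7, 11, 13, 17, 19, 23
SEPTUPLET = (0, 2, 6, 8, 12, 18, 20)   # e.g. 11, 13, 17, 19, 23, 29, 31

print(gaps(SEXTUPLET), max(gaps(SEXTUPLET)))
print(gaps(SEPTUPLET), max(gaps(SEPTUPLET)))
```

The 6-tuple only needs differences of 2 and 4, while this 7-tuplet pattern already contains a difference of 6, which is why the sieve needs the extra case.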
Pttn wrote:
01 Nov 2018, 11:56
Unfortunately, the CPU underuse bug is still present, although it is a bit less severe. However, it looks like I got a massive ~40% increase on my 2700X @3.4 GHz using 2 (instances) x 8 threads and a sieve of 2^32 (for actual mining at difficulty 1395)! Though, the CPU underuse was significant before (even with these 2 instances) and is negligible now.
I haven't done anything to directly address this yet, though it makes sense that increasing the performance of the calculations that happen before primality testing starts would help. Note that in "real" mining the performance benefits are slightly higher than in the benchmark because they apply to work that is always done before primality testing starts.
Pttn wrote:
01 Nov 2018, 11:56
I will do some further benchmarking and code review later. If the results are really convincing and you allow your code to be released under the MIT licence, I will merge your pull request. It is also Ok if you choose GPL, as the miner part of the code is already under this licence, but personally, I think more permissive is better, especially in something related to math research. And even if the CPU underuse bug is still present, it is already worth merging a significant performance enhancement.

Would you be able to completely fix the CPU underuse bug before I release a new 0.9β3? I really wish to get this fixed before releasing a stable version. In particular, mining with 8 threads or more on Testnet should properly use the CPU at 100% for any Sieve up to 2^32. It would also be nice if you could fix the multiplication overflow to allow a sieve higher than 2^32 (if possible).
I'm happy to release these changes under the MIT licence. As I said above some later changes will need to be GPL as they will be based on GMP.

I'm not sure when you are thinking for a new release, I'm unlikely to make significant improvements on the CPU underuse issue for a couple of weeks, as I'd like to port the changes that benefit my mining first (I don't have access to a box with >8 threads, although I can spin up an EC2 instance or something for testing).

Pttn
Posts: 131
Joined: 24 Aug 2018, 13:37

Re: rieMiner - Solo + pooled Riecoin mining

Post by Pttn » 01 Nov 2018, 19:30

Ok, it is fine if you copy some modded GMP code, and in that situation GPL is perfectly fine. The main aim is to avoid requiring people to compile the modded library themselves. In this case, I would like you to put this code separately, in an "External" folder. And Ok, supporting tuple lengths up to 12 is already good.

So, I did the benchmarks, and there is indeed a noticeable improvement that cannot be attributed to statistical error. With the Standard Benchmark settings, however, it is very small, about 2-3% if we compare the 2-tuples/s speeds.

Comparison, old code (2 to 4-tuples/s) to new code (1 to 4-tuples/s), for the 2700X@4 GHz and the 6700K@3 GHz, DDR4 2400. Sieve of 2^30.
3 and 4-tuples/s metrics should be discarded.

- (7.209693 0.246831 0.007899) to (221.429310 7.376304 0.257705 0.004639)
- (2.766217 0.091849 0.003031) to (84.502101 2.847050 0.092587 0.002500)
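As a quick check of the "about 2-3%" figure, here is a small Python sketch that computes the relative change of the 2-tuples/s values in the two rows above. It assumes the first number of each left-hand group and the second number of each right-hand group are the 2-tuples/s rates; `speedup_percent` is just a helper name for this example.

```python
def speedup_percent(before, after):
    """Relative improvement of a rate, in percent."""
    return (after / before - 1.0) * 100.0

# (2-tuples/s before, 2-tuples/s after), copied from the two rows above
rows = [(7.209693, 7.376304), (2.766217, 2.847050)]

for before, after in rows:
    print(f"{speedup_percent(before, after):.2f}%")
```

This gives roughly 2.3% for the first row and 2.9% for the second.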

I get this with a Sieve of 2^31 on the 2700X@4 GHz: (216.912443 7.568411 0.254400 0.008904)
Comparing with the previous code here makes no sense, because with the previous code the CPU already suffers from the CPU underuse at this Sieve. But it seems that here, we enjoy a better improvement.

But, the improvements seem greater during actual mining (to be confirmed later).

I would love for you to fix the CPU underuse problem; there must be people with Threadrippers, Epycs or Xeons who would benefit greatly from having this fixed. And myself too, as I have to launch 2 instances and waste GBs of RAM to mine properly with the 2700X. If you are interested in fixing it, you can mine on Testnet/Benchmark with Diff 304 with almost any Sieve, and even your 3770 would suffer from the CPU underuse bug. Then, you might be able to investigate where the problem could be and fix it. But do not feel obliged or pressured to fix this. It is already great that you provide performance enhancements :D !

I was able to compile on Linux and Windows x64 without any issue.

Now, I would like to merge your pull request, but unfortunately, there is one critical flaw: I cannot do the Easy Benchmark with your code. The Benchmark with Difficulty 800 does not work with any Sieve below ~309M. Likewise, it does not work with Difficulty 304 (same as Testnet) below 115M, whereas before I could set a Sieve of 2^20 = ~1M without any problem at Difficulty 304. These thresholds apply for 8 or 16 threads on both the 2700X and the 6700K, and on both Debian 9 and a Windows 10 VM. In these conditions, it just does nothing and shows 0 for all metrics.
Please fix this and test different difficulties from 304 to 3000 with different Sieves from 1M to 4G, and make sure that the miner works for all these parameters. This was the case previously: even Sieve 1M worked for Difficulty 3000.
Or was this intended? I am Ok with requiring a minimum Sieve, but you must ensure that the miner will work properly.

There is no set date for β3; I bump the version when I feel that there have been enough changes. In this case, I will bump to β3 once you have finished integrating the optimizations into rieMiner, have ensured that they did not introduce bugs like the one in the previous paragraph, and I have merged them. Until then, progressively merging your code will just increase the version to β2.4, β2.5,... If you fix the CPU underuse soon, I can include the fix in β3; otherwise β3 will just be released before the fix. For the stable 0.9, I am not going to release it until we decide that 0.16.3 is the new official wallet, but it would be really great if the CPU underuse bug got fixed by then (there is plenty of time).

Rockhawk
Posts: 48
Joined: 29 Oct 2018, 21:12

Re: rieMiner - Solo + pooled Riecoin mining

Post by Rockhawk » 01 Nov 2018, 22:53

I haven't had much time tonight but I've quickly fixed that problem with the small sieve sizes and updated the pull request.

It might be useful to have an automated test that ran maybe 10 or so different configurations for a few minutes each, testing extremes of supported sieve ranges, difficulties, and maybe a larger tuple configuration. Obviously the actual performance would differ between machines, but just looking at the 1 and 2 tuple results should help spot any errors like this one, as well as obvious performance regressions.
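A rough Python sketch of what such a test matrix could look like. Everything here is hypothetical: `run_benchmark` stands for whatever would launch the miner in benchmark mode and return the measured (1-tuples/s, 2-tuples/s) rates, and it is passed in as a parameter because the thread does not describe how to drive the miner programmatically. The difficulty and sieve values are just sample points within the ranges mentioned above (304 to 3000, 1M to 4G).

```python
from itertools import product

DIFFICULTIES = (304, 800, 1600, 3000)
SIEVES = (2**20, 2**27, 2**30, 2**32)

def find_failures(run_benchmark, difficulties=DIFFICULTIES, sieves=SIEVES):
    """Run every (difficulty, sieve) pair; return the pairs whose 1- or 2-tuple rate is zero."""
    failures = []
    for difficulty, sieve in product(difficulties, sieves):
        rates = run_benchmark(difficulty, sieve)  # expected: (1-tuples/s, 2-tuples/s)
        if rates[0] <= 0 or rates[1] <= 0:
            failures.append((difficulty, sieve))
    return failures

def fake_runner(difficulty, sieve):
    # Stand-in for a real run: pretends nothing is found below a sieve threshold,
    # loosely mimicking the small-sieve bug reported earlier (numbers are invented)
    return (0.0, 0.0) if sieve < 115_000_000 else (200.0, 7.0)

print(find_failures(fake_runner))
```

With a real runner plugged in, an empty list would mean every configuration produced at least some 1- and 2-tuples; comparing the rates against a stored baseline could flag performance regressions too.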

I'm away the next few days, but next week I'll get the fix done for sieving 7-12 tuples, and continue bringing optimizations across.

Pttn
Posts: 131
Joined: 24 Aug 2018, 13:37

Re: rieMiner - Solo + pooled Riecoin mining

Post by Pttn » 02 Nov 2018, 04:12

Great, I merged your code. Now, we are at 0.9β2.4. Thank you, I am looking forward to enjoying the next improvements. It is also great to have new developers for Riecoin!

I take note regarding writing testing code, though this is not really a priority for me, as the code is not that complex and manual testing is fine. I might do this later, around the stable release.

czakris
Posts: 20
Joined: 16 Sep 2018, 05:03

Re: rieMiner - Solo + pooled Riecoin mining

Post by czakris » 05 Nov 2018, 20:26

I am using 0.9β2.4 and compiled it with mingw64, but it seems that I am unable to set the Sieve max value higher than 4294967296 (2^32). Previous versions worked fine with a sieve value of 17179869184 (2^34) without any issue. How can I fix it?

Rockhawk
Posts: 48
Joined: 29 Oct 2018, 21:12

Re: rieMiner - Solo + pooled Riecoin mining

Post by Rockhawk » 05 Nov 2018, 21:41

czakris wrote:
05 Nov 2018, 20:26
I am using 0.9β2.4 and compiled it with mingw64, but it seems that I am unable to set the Sieve max value higher than 4294967296 (2^32). Previous versions worked fine with a sieve value of 17179869184 (2^34) without any issue. How can I fix it?
Previous versions allowed you to set the sieve limit higher than 2^32 but it was bugged (there was a multiplication overflow), so I added an assert to protect against sieve size > 2^32 for now. I intend to fix it properly, hopefully this week.
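The thread does not say which multiplication overflows, so the following Python sketch is only a generic illustration of how a product can wrap once a factor passes 2^32 when results are kept in 64 bits. `mul_u64` and `overflows_u64` are made-up helpers that mimic unsigned 64-bit arithmetic; Python integers themselves never overflow.

```python
MASK64 = (1 << 64) - 1

def mul_u64(a, b):
    """Multiply as an unsigned 64-bit machine would: keep only the low 64 bits."""
    return (a * b) & MASK64

def overflows_u64(a, b):
    """True if the exact product does not fit in 64 bits."""
    return a * b > MASK64

below = (1 << 32) - 1    # largest value that fits in 32 bits
above = (1 << 32) + 15   # just past 2^32

print(overflows_u64(below, below))               # two factors under 2^32
print(overflows_u64(above, above))               # two factors over 2^32
print(mul_u64(above, above) == above * above)    # does the wrapped result match the exact one?
```

Two factors below 2^32 always give a product that fits in 64 bits, which is why a guard at 2^32 is a natural place for an assert until the arithmetic is widened.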
