About Me

Just call me Wolf. I have gone by Wolf, Wolf0, Wolf9466, and OhGodAPet, although I consider Wolf9466 or just Wolf my current handles. I’m… well, anywhere I can set up computer shit, I guess - although I have traveled out of the country and such.

I began programming when I was 13 or so, starting with C++. I remember I didn’t like it, got a book on C, and found it less of a hassle to learn. I have always wanted to make low-level stuff, so I wanted to write an OS.

My next large project was an app with a Win32 GUI - its purpose was to manage updating and reproducing collections of art off a couple of sites.

However, there were a few problems: the main site I wanted to target did not have an API (it still doesn’t), and at that stage in life, I was even more against external dependencies.

This meant I crafted my own HTTP requests in C and parsed the HTML in C to extract links to full-size images, then fetched the images themselves. It could search by artist, and filter by rating (safe, mature, adult), source of the image (the artist’s gallery, scraps, or favorites) and more. I put a hell of a lot of work into it, and later expanded it to support e621 (which thankfully does have an API).
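To give a rough idea of what "crafting HTTP requests and parsing HTML in C" with no external dependencies looks like, here is a minimal sketch. It is not the original code from that app: the function names `build_get_request` and `next_href` are made up for illustration, the host and path in the usage note are placeholders, and the naive `href="` scan assumes double-quoted attributes. It only covers building the request text and pulling link targets out of a page already in memory; the socket handling is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a minimal HTTP/1.1 GET request into buf.
 * Returns the request length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t cap, const char *host, const char *path)
{
    assert(buf && host && path);
    int n = snprintf(buf, cap,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n"
                     "\r\n",
                     path, host);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Hypothetical helper: find the next href="..." at or after *cursor and
 * copy its value into out (truncated to fit). Advances *cursor past the
 * match. Returns 1 if a link was found, 0 when there are no more. */
int next_href(const char **cursor, char *out, size_t cap)
{
    static const char key[] = "href=\"";
    assert(cursor && *cursor && out && cap > 0);

    const char *start = strstr(*cursor, key);
    if (!start)
        return 0;
    start += sizeof key - 1;            /* skip past href=" */

    const char *end = strchr(start, '"');
    if (!end)
        return 0;                       /* unterminated attribute */

    size_t len = (size_t)(end - start);
    if (len >= cap)
        len = cap - 1;                  /* truncate rather than overflow */
    memcpy(out, start, len);
    out[len] = '\0';

    *cursor = end + 1;
    return 1;
}
```

In use, you would call `build_get_request(req, sizeof req, "example.com", "/some/gallery/page")`, send the result over a TCP socket, read the response into a buffer, then loop on `next_href` to collect candidate links and keep the ones that point at full-size images.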

Following that, I began doing miner development in 2014. It began with me finding a coin that was new at the time, XMR. The public CPU miner was horribly slow. It was badly done, and since I was good at low-level work and optimizations, I just had to fix it, like OCD. You can still see some of that from 2014 here. I wasn’t even aware at the time that what I was able to do was valuable… well, not until someone paid me to stop.

Current (possible?) project

So… there’s this new FPGA miner which is being produced, the Osprey E300, and I have good reason to believe that quite a sizable number will be made and sold. The current firmware has very little to offer in terms of easy management capabilities, or detailed statistics reporting.

It does have a WebUI, but it is rather barren. The few tunable options which are available (such as limited control of clock speeds and voltages) are only offered via miner programs - and those options are FAR from complete, compared to what is possible on this machine.

Previous project

Xilinx are underclocking the HBM2 (forcibly - the user is unable to change it, as their encrypted HBM IP will not generate settings for its internal PLL to run the memory above 900MHz). The part is Samsung Aquabolt HBM2, and it’s actually specified by Samsung for 1000MHz - 1200MHz (depending on binning).

Many researchers, companies, and others benefit from the performance of these parts. Xilinx also hide the details of the configuration ports for their hard memory controllers and PHYs…

So I’m working on reverse engineering over a hundred registers, and changing the clock speed, but it’s not that simple. You see… one does not simply change PLL settings. You must put the MCs in reset, put the PHYs in reset, then change the PLL settings to whatever you want, but then you gotta bring up the PHYs, make sure they don’t spew garbage during init, then release the MCs from reset, then calibrate them… In short, you have to bring the entire memory system back up. By yourself.
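The ordering is the whole point, so here is a minimal sketch of that sequence as code. Nothing in it is real register access: the actual registers are undocumented, so each step is handed to a caller-supplied callback, and every name here (`hbm_step`, `hbm_retune`, `hbm_trace_step`) is hypothetical. Stopping at the first failed step is also an assumption about what a sane driver would do, not a description of the real bring-up. The tracing callback is just a dry-run harness that records the order.

```c
#include <assert.h>
#include <stddef.h>

/* The steps, in the order described above. */
enum hbm_step {
    MC_ASSERT_RESET,    /* put the memory controllers in reset */
    PHY_ASSERT_RESET,   /* put the PHYs in reset */
    PLL_PROGRAM,        /* write the new PLL settings */
    PHY_RELEASE_RESET,  /* bring the PHYs back up */
    PHY_CHECK_INIT,     /* make sure they are not spewing garbage */
    MC_RELEASE_RESET,   /* release the memory controllers from reset */
    MC_CALIBRATE,       /* recalibrate */
    HBM_STEP_COUNT
};

/* Caller-supplied hook that performs one step; returns 0 on success. */
typedef int (*hbm_step_fn)(void *ctx, enum hbm_step step);

/* Run every step in order. Stops at the first failure (an assumption),
 * stores the failing step in *failed if non-NULL, and returns -1.
 * Returns 0 if the whole sequence completed. */
int hbm_retune(hbm_step_fn run, void *ctx, enum hbm_step *failed)
{
    assert(run != NULL);
    for (int s = 0; s < HBM_STEP_COUNT; s++) {
        if (run(ctx, (enum hbm_step)s) != 0) {
            if (failed)
                *failed = (enum hbm_step)s;
            return -1;
        }
    }
    return 0;
}

/* Dry-run harness: records the steps it is asked to perform and can be
 * told to fail at one of them (fail_at = -1 means never fail). */
struct hbm_trace {
    enum hbm_step seen[HBM_STEP_COUNT];
    int count;
    int fail_at;
};

int hbm_trace_step(void *ctx, enum hbm_step step)
{
    struct hbm_trace *t = ctx;
    if (t->count < HBM_STEP_COUNT)
        t->seen[t->count++] = step;
    return ((int)step == t->fail_at) ? -1 : 0;
}
```

Calling `hbm_retune(hbm_trace_step, &trace, &failed)` with `trace.fail_at = -1` walks all seven steps. On real hardware the callback would be replaced by one that pokes the reverse-engineered registers for each step.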

I received an official reply from Xilinx, and in my opinion, their answer was quite good. They do admit I was right about the HBM parts - they are specified for 1000MHz. However, the reliability guidelines stated in the datasheet seem to confirm the idea that the FPGA fabric is the limiting factor (speedgrade -1 devices have a max of 800MHz, while the faster -2 and -3 speedgrades have a max of 900MHz). Because of that, there’s a real chance that a part won’t be able to maintain those data rates once process variations, power rail requirements, and reference clock quality guidelines are considered.

Further, there is no way for the user to make those changes, because it would force them to take returns/RMAs on parts which meet the expectations laid out by the datasheet, but would fail at higher HBM data rates. It’s also a support issue - allowing users to exceed the max speed of the supported use case could cause failures for end customers, and that would put Xilinx in a bad light. Their official answer can be found here.