liori

Programming Now

Tool for instantiating a C++ template at runtime?

liori Now • 100%

> Personally I think child processes are the right approach for this. Launch a new process* for each query and it can (if you choose to go that route) dynamically load in compiled code. Exit when you’re done, and the dynamically loaded code is gone. A side benefit of that is memory leaks are contained, since all memory you allocate is about to be removed anyway.

I'd probably be fine with hundreds or thousands of these hanging in memory. I suspect the generated code for a single query would be in the hundreds of kilobytes, maybe a megabyte. But yeah, this is one of those technical details I'd worry about.

> Honestly, I wonder if you could just use an actual HTTP server for this? They can handle hundreds or even thousands of simultaneous requests. They can handle requests that complete in a fraction of a millisecond or ones that run for several hours. And they have good tools to catch/deal with code that segfaults, hits an endless loop, attempts to allocate terabytes of swap, etc. HTTP also has wonderful tools to load balance across multiple servers if you do need to scale to massive numbers of requests.

Not sure how an HTTP server would solve the CPU bottleneck of scanning terabytes of data per query?

1

Programming Now

Tool for instantiating a C++ template at runtime?

liori Now • 100%

I somehow didn't think a regular JIT solution might be applicable here, but it is. Thank you! There seem to be a number of projects doing JIT for C++; I will look at them.

2

Programming liori • Now • 100%

Tool for instantiating a C++ template at runtime?

I'm working on a query engine, essentially a tool to scan/filter/annotate by lookups/group by/aggregate a large dataset, tens-of-terabytes range. The compute part seems to be a bottleneck for me (I'll be doing around 80-300 GB/s of reads, and yes, I will have hardware capable of providing that kind of throughput). My hypothesis is that by encoding the query in the form of template arguments I can make the compiler generate code optimized for a specific type of query (like the filtering or aggregation keys). But I do not know what queries users will send, so I need a way to instantiate templates at runtime. Sounds simple: for a new type of query, invoke a compiler at runtime to build a dynamic library with a new instantiation, then dynload it and off we go. Some [prior work](https://www.researchgate.net/publication/309663655_Runtime_Template_Instantiation_for_C) is here, though I'm pretty sure any JIT compiler also counts here. But there are enough technical details to worry about, and at the same time this idea isn't novel, so I wonder: are there any packaged solutions for this kind of approach?
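A minimal sketch of the idea described above, under stated assumptions. The row layout, the `scan_sum` kernel, the `Cmp` enum and the helpers `instantiation_source` and `compile_command` are hypothetical names made up for illustration, not part of any existing library. The first half shows a query encoded as non-type template arguments, so the compiler sees the operator and the threshold as constants. The second half generates the source for one new instantiation and the command line that would build it into a shared library. The header name `kernels.hpp`, the `c++` driver name and the flags are assumptions about the build environment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical row layout, for illustration only.
struct Row {
    std::int64_t key;
    std::int64_t value;
};

// Comparison operators a query may use; passed as a template argument.
enum class Cmp { Less, Greater, Equal };

// "SELECT SUM(value) WHERE key <Op> Threshold" with Op and Threshold fixed at
// compile time. Op is a constant inside each instantiation, so the switch
// below can be folded away by the optimizer.
template <Cmp Op, std::int64_t Threshold>
std::int64_t scan_sum(const Row* rows, std::size_t n) {
    assert(rows != nullptr || n == 0);
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        bool keep = false;
        switch (Op) {
            case Cmp::Less:    keep = rows[i].key < Threshold;  break;
            case Cmp::Greater: keep = rows[i].key > Threshold;  break;
            case Cmp::Equal:   keep = rows[i].key == Threshold; break;
        }
        if (keep) sum += rows[i].value;
    }
    return sum;
}

// Spelling of a Cmp value as it must appear in generated source.
inline std::string cmp_name(Cmp op) {
    switch (op) {
        case Cmp::Less:    return "Cmp::Less";
        case Cmp::Greater: return "Cmp::Greater";
        case Cmp::Equal:   return "Cmp::Equal";
    }
    return "Cmp::Equal";  // not reached for valid enum values
}

// Source of a tiny translation unit that instantiates scan_sum for one query
// and exports it under a fixed C symbol, so it can be looked up without
// dealing with C++ name mangling. "kernels.hpp" is an assumed header that
// would hold the definitions above.
inline std::string instantiation_source(Cmp op, std::int64_t threshold) {
    return "#include \"kernels.hpp\"\n"
           "extern \"C\" std::int64_t run_query(const Row* rows, std::size_t n) {\n"
           "    return scan_sum<" + cmp_name(op) + ", " + std::to_string(threshold) +
           ">(rows, n);\n"
           "}\n";
}

// Command line that would build the generated file into a shared library.
// The driver name and flags are assumptions; paths are not shell-quoted here.
inline std::string compile_command(const std::string& src, const std::string& lib) {
    return "c++ -std=c++17 -O2 -shared -fPIC -o " + lib + " " + src;
}
```

At runtime the engine would write the generated source to a file, run the command (for example via `std::system`), and then load the library with POSIX `dlopen` and look up `run_query` with `dlsym`. Caching one library per distinct query shape would avoid recompiling for repeated queries; how well that works in practice depends on how many distinct shapes users send.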

19

7

Selfhosted Now

Please recommend your cheaper, reliable SSDs 2TB+ (4TB ideal)

liori Now • 100%

So far I've been following recommendations from this person: https://old.reddit.com/r/NewMaxx/comments/16xhbi5/ssd_guides_resources_ssd_help_post_your_questions/

12

boardgames Now

Custom Insert for Terraforming Mars

liori Now • 100%

Plenty of them on various sites, like this one I found yesterday.

3

Programming liori • Now • 94%

Encrypted traffic interception on Hetzner and Linode targeting the largest Russian XMPP (Jabber) messaging service

http://notes.valdikss.org.ru/jabber.ru-mitm/

> TL;DR: we have discovered XMPP (Jabber) instant messaging protocol encrypted TLS connection wiretapping (Man-in-the-Middle attack) of jabber.ru (aka xmpp.ru) service’s servers on Hetzner and Linode hosting providers in Germany. The attacker has issued several new TLS certificates using Let’s Encrypt service which were used to hijack encrypted STARTTLS connections on port 5222 using transparent MiTM proxy. The attack was discovered due to expiration of one of the MiTM certificates, which haven’t been reissued. There are no indications of the server breach or spoofing attacks on the network segment, quite the contrary: the traffic redirection has been configured on the hosting provider network. The wiretapping may have lasted for up to 6 months overall (90 days confirmed). We believe this is lawful interception Hetzner and Linode were forced to setup.

31

4

World News liori • Now • 99%

Poland exit polls: PiS party wins most votes but opposition coalition possible

www.theguardian.com

In short, by the exit polls:

- PiS (the current ruling party, right-wing) got the most votes, but cannot rule alone, and falls well short of a majority even in coalition with the alt-right party Konfederacja (212 mandates total vs. 231 needed for a majority).
- The opposition parties (center-right Koalicja Obywatelska, center-right Trzecia Droga and leftist Lewica) together have a majority. They know they have to form a government together; the question is whether they can overcome their differences. They did suggest strong coöperation during their campaigns.
- Highest-ever turnout (72%) in the elections.
- The accompanying referendum (a device for the current ruling party to have more funds for promoting its ideas) was a total failure (40% turnout; voters had to explicitly opt out of participation!).

104

6

Programming liori • Now • 97%

C++ Design Patterns for Low-latency Applications Including High-frequency Trading

https://arxiv.org/abs/2309.04259v1

While high-frequency trading is not exactly my favourite topic, I do like reading on their technical approaches.

By Paul Bilokon, Burak Gunduz

> This work aims to bridge the existing knowledge gap in the optimisation of latency-critical code, specifically focusing on high-frequency trading (HFT) systems. The research culminates in three main contributions: the creation of a Low-Latency Programming Repository, the optimisation of a market-neutral statistical arbitrage pairs trading strategy, and the implementation of the Disruptor pattern in C++. The repository serves as a practical guide and is enriched with rigorous statistical benchmarking, while the trading strategy optimisation led to substantial improvements in speed and profitability. The Disruptor pattern showcased significant performance enhancement over traditional queuing methods. Evaluation metrics include speed, cache utilisation, and statistical significance, among others. Techniques like Cache Warming and Constexpr showed the most significant gains in latency reduction. Future directions involve expanding the repository, testing the optimised trading algorithm in a live trading environment, and integrating the Disruptor pattern with the trading algorithm for comprehensive system benchmarking. The work is oriented towards academics and industry practitioners seeking to improve performance in latency-sensitive applications.

35

0

datahoarder Now

Recovering Hardware RAID array in software

liori Now • 100%

Try dmraid, it's been designed to take over various formats of hardware RAID cards.

2

Programming Now

Linux file system developer: we're severely under-resourced

liori Now • 83%

The kernel is not a monolithic application, and you cannot develop it like one. There are tons of actors: independent developers, small support companies (like Collabora), corporations, all with different priorities. There is a large number of independent forks (e.g. for obscure devices) that will never be merged, but that need to merge e.g. security patches from the mainline. A single project management tool won't do, at least not your typical business-grade tracking and reporting tool.

CI is already there. Not a central one—again, distributed across different organizations. Different organizations have different needs for CI, e.g. supporting weird architectures that they need to develop against.

There is a reason Torvalds created git—existing tools just wouldn't work. There might be a place for a similar revolution regarding a bugtracker…

4

Programming Now

Linux file system developer: we're severely under-resourced

liori Now • 100%

This plea for help is specifically for non-coding, but still deeply technical work.

23

Programming Now

Linux file system developer: we're severely under-resourced

liori Now • 100%

The thread is an attempt to merge a new file system, bcachefs. This is a large change, requiring a lot of review from experienced developers, and getting anyone to do this work turned out to be difficult. Darrick here started talking about how, in general, all development of file systems in Linux is troubled by a lack of manpower.

2

Programming Now

Linux file system developer: we're severely under-resourced

liori Now • 100%

I guess the best start would be to have a person to organize volunteers.

8

Programming liori • Now • 98%

Linux file system developer: we're severely under-resourced

https://lore.kernel.org/lkml/20230810223942.GG11336@frogsfrogsfrogs/

> I've said this previously, and I'll say it again: we're severely under-resourced. Not just XFS, the whole fsdevel community. As a developer and later a maintainer, I've learnt the hard way that there is a very large amount of non-coding work is necessary to build a good filesystem. There's enough not-really-coding work for several people. Instead, we lean hard on maintainers to do all that work. That might've worked acceptably for the first 20 years, but it doesn't now.
>
> […]
>
> Dave and I are both burned out. I'm not sure Dave ever got past the 2017 burnout that lead to his resignation. Remarkably, he's still around. Is this (extended burnout) where I want to be in 2024? 2030? Hell no.

501

97

boardgames Now

How do you store your bigger games that don't fit your shelf?

liori Now • 100%

I'm pretty sure just like transport containers were standardized by ISO to make transport easier, game boxes should be standardized to fit in Kallax.

6

Programming Now

Intentionally corrupting LLM training data?

liori Now • 100%

Another idea that just occurred to me. Maybe position: absolute; both the real content and the gibberish content with the same top, left, width, and height attributes so that the real content and the gibberish overlap and occupy the same location on the page. Make sure both the real and gibberish content elements have no background so that remains clear. Put the gibberish content in the DOM before the real content. (I think that will ensure that the gibberish appears behind the real content even without setting the z-index.) And then make JS set the color of the text in the gibberish element the same color as the background so humans can’t see it.

Be aware that these techniques can affect accessibility for people using screen readers.

5

World News Now

Refugees overqualified and underpaid in Germany

liori Now • 100%

As of May 2023, 65% of the Ukrainian refugees who left Ukraine starting February 2022 and decided to stay in Poland had found a job, so within around a year, as opposed to the 5-6 years in the article. Cultural similarity here is likely making it much, much simpler. For those who want to read more about the situation of Ukrainian refugees in Poland, this report by the Polish National Bank (Narodowy Bank Polski, NBP) might be useful: https://nbp.pl/wp-content/uploads/2023/05/Raport_Imigranci_EN.pdf (in English!). There are a lot of interesting details.

6

Lemmy Support Now

Can an admin please purge my alt account on this instance so that it doesn't appear on Google with Lemmy NSFW?

liori Now • 100%

lemmy.ml is hosted in the EU, and lemmynsfw.com uses CloudFlare, which operates in the EU. Worst case, issue a GDPR request to both.

3

Programming Now

Asynchronous cross-organizational APIs—any available tooling?

liori Now • 100%

Yep, thank you, that's pretty close to what I imagined!

5

Programming liori • Now • 92%

Asynchronous cross-organizational APIs—any available tooling?

We are working on a tool that essentially allows external customers to access various extracts of our datasets, with parameterized filtering, aggregation, the usual stuff, through a REST API. Some of these extracts are time-consuming to prepare, so we are looking for ways to manage asynchronous report generation or to make it possible for customers to schedule reports upfront, as opposed to having a synchronous API. There are tons of libraries for implementing synchronous REST APIs, but are there any standard approaches or tools for this kind of asynchronous cross-organizational communication? Like, maybe something that would allow each customer to inspect their schedules and pending queries, and configure how they want the results to be delivered? I fear we will need to build something like that from scratch.
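To make the question concrete, here is a minimal sketch of the bookkeeping such an asynchronous API needs, assuming a submit-then-poll design: a request is accepted immediately and returns an id, and the customer later asks for the status or lists their unfinished jobs. Everything here (`JobRegistry`, `JobStatus`, the method names, the result URL field) is a hypothetical in-memory model for illustration, not an existing tool or standard.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Lifecycle of one report request (hypothetical status names).
enum class JobStatus { Pending, Running, Done, Failed };

struct Job {
    std::string customer;    // who submitted the request
    std::string query;       // opaque description of the requested extract
    JobStatus status = JobStatus::Pending;
    std::string result_url;  // filled in once the extract is ready
};

class JobRegistry {
public:
    // Accept a request and return at once with an id the customer can poll.
    std::uint64_t submit(const std::string& customer, const std::string& query) {
        assert(!customer.empty());
        std::uint64_t id = next_id_++;
        Job job;
        job.customer = customer;
        job.query = query;
        jobs_[id] = job;
        return id;
    }

    // Called by a worker when it picks the job up. Throws std::out_of_range
    // (from std::map::at) if the id is unknown.
    void start(std::uint64_t id) { jobs_.at(id).status = JobStatus::Running; }

    // Called by a worker when the extract has been written somewhere.
    void finish(std::uint64_t id, const std::string& result_url) {
        Job& job = jobs_.at(id);
        job.status = JobStatus::Done;
        job.result_url = result_url;
    }

    // What a status endpoint would return for one job.
    const Job& get(std::uint64_t id) const { return jobs_.at(id); }

    // Ids of a customer's jobs that have not finished yet, in submission order.
    std::vector<std::uint64_t> pending_for(const std::string& customer) const {
        std::vector<std::uint64_t> ids;
        for (const auto& entry : jobs_) {
            const Job& job = entry.second;
            bool unfinished = job.status == JobStatus::Pending ||
                              job.status == JobStatus::Running;
            if (job.customer == customer && unfinished) ids.push_back(entry.first);
        }
        return ids;
    }

private:
    std::uint64_t next_id_ = 1;
    std::map<std::uint64_t, Job> jobs_;
};
```

A real service would add persistence, authentication, scheduling and delivery options (for example a callback instead of polling); the sketch only shows the state each customer would want to inspect.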

21

2

World News Now

Osaka Expo asks for overtime cap exemption as time pressure mounts

liori Now • 100%

A lack of planning on your part doesn’t constitute an emergency on mine.

Though I kind of think Japanese grammar cannot express this thought and the closest you can get is Ganbatte!

4

World News Now

More than 1000 fires burning in Canada right now

liori Now • 100%

Good question! I quickly found this table, though these are yearly statistics only: https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=3510019201

11

datahoarder Now

Cloud Backup

liori Now • 100%

Yep, it's EU. File transfer shouldn't be bad if your files are large, though it's best to test it first, as it might depend on your ISP's peering and your preferred transfer protocols/tooling. Whether it's reputable for your purpose, you probably have to do your own research. Also, remember that the offer I mentioned would only be equivalent in durability to a single-box RAID5 for your purposes, so not exactly equivalent to Google's.

1

datahoarder Now

Cloud Backup

liori Now • 100%

There's Jottacloud with unlimited storage for 10 EUR/month, but they gradually slow down after the first 5 TB. 30 TB might be a bit too much. There's Hetzner with their dedicated 4×10TB machines for ~52 EUR; you could do RAID5 and have somewhat redundant 30 TB, at the cost of self-managing a dedicated machine. There are several providers doing regular S3 (which you can take advantage of with tools like rclone) with decent redundancy for 4-5 USD/TB + egress. For high-value data you should probably be spending more than 100 USD/month for 30 TB in the cloud, or invest in actual hardware. Do you need hot access to this dataset, or is a cold storage archive enough?

2

Free and Open Source Software Now

Thunderbird releases 115 'Supernova', with an overhaul to their design and functionality

liori Now • 100%

Will they keep the dense email list view as an option? Seeing more than the 14 email messages visible on the screenshot in the post is useful to sort out large folders.

1

Lemmy Support Now

Changing domain name of lemmy server

liori Now • 100%

I'm surprised federation isn't based on asymmetric cryptography. Let the public/private keys identify instances, as opposed to domains, which risk being blocked by governments or bought by malicious third parties if the instance owner forgets to renew them.

With that, implementing a change in domain names would be simple.

1

Programming liori • Now • 100%

In-depth explanation of how hard disk drives work

https://web.archive.org/web/20190121065908/http://pcguide.com/ref/hdd/index.htm

Some time ago I was looking for sources that would give me an in-depth understanding of the performance characteristics of large-scale storage. This is the best text on hard disk drives I've found so far, explaining details such as various switch times, zoned recording or head skew. It's almost 20 years old, though, and so misses some developments. I wonder if you know any more modern sources?

36

1