The developers of Manjaro Linux, a distribution based on Arch Linux and aimed at beginners, have announced the start of testing for a new service, MDD (Manjaro Data Donor), which collects statistics about the system and sends them to an external server run by the project. The author of MDD intended to enable telemetry by default (opt-out), but that decision has not yet been approved. Judging by the objections from some developers and users, telemetry will likely be offered as an option requiring the user's prior consent (the proposal is to add a prompt to enable telemetry to the welcome screen shown after the first boot).

The report includes data such as the host name, kernel version, desktop component versions, detailed information about the hardware and drivers in use, screen size and resolution, MAC addresses of network devices, disk serial numbers, disk partition data, the number of running processes and installed packages, and the versions of basic packages such as systemd, gcc, bash and PipeWire.

The submitted data is stored on the project's server in a ClickHouse database and visualized with Grafana. Users' IP addresses are not stored, and a hash of the /etc/machine-id file is used as the system identifier.
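
As a rough illustration of what "a hash of /etc/machine-id as the system identifier" could look like, here is a minimal sketch. The summary does not say which hash function MDD uses, so SHA-256 is an assumption here, and `system_id` is a hypothetical helper name, not something taken from the MDD code.

```python
import hashlib
from pathlib import Path


def system_id(machine_id_path: str = "/etc/machine-id") -> str:
    """Return a hex digest derived from the machine-id file.

    Assumption: SHA-256 over the stripped file contents. The actual
    MDD implementation may hash differently.
    """
    raw = Path(machine_id_path).read_text().strip()
    return hashlib.sha256(raw.encode("ascii")).hexdigest()
```

The identifier stays the same across reports from one installation, which is what lets the server count unique systems without storing IP addresses.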

According to the code (https://github.com/manjaro/mdd/blob/master/mdd.py#L40), it sends everything.

  • Buffalox@lemmy.world
    17 days ago

    The MAC address is anonymized with sha256, and IP addresses aren’t stored.
    So this seems to me to be perfectly anonymous.

    • GolfNovemberUniform@lemmy.ml
      17 days ago

      Why collect such data though? And you can call some Big Tech telemetry completely anonymous too if you trust their explanations.

      • Buffalox@lemmy.world
        17 days ago

        You can see the code of what is sent.
        I’m not aware that Google claims they collect data anonymously, on everything where you are logged in.
        So that’s a false equivalence.

        • GolfNovemberUniform@lemmy.ml
          17 days ago

          > I’m not aware that Google claims they collect data anonymously, on everything where you are logged in.

          I meant other companies but ok.

    • gnuhaut@lemmy.ml
      17 days ago

      MAC addresses are 48 bits, and half of that is just the manufacturer ID. So it’s 24 bits really, and those bits aren’t random; I think manufacturers just assign them based on some scheme, like a serial number. Point is, you could easily reverse the SHA by brute force.

      You can’t calculate any useful statistic from a hash so literally the only use this would have is some sort of tracking.


      Edit: I just looked up some data and I found someone using hashcat on an RTX 3090, which looks like it can do almost 10000 million SHA256 hashes per second of salted passwords (which are longer than 48 bit MACs, so MACs should be faster). 2²⁴ is 16.8 million, so it’ll take about 1.7 ms per vendor. I found a database with (all?) 53011 vendor ids:

      >>> 2**24 * 53011 / 10000 / 1000 / 1000
      88.93769973759998
      

      Yup, 89 seconds. You can calculate the SHA256 of every single MAC ever potentially issued in 89 seconds on a bog-standard 3090.
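
The brute-force argument above can be sketched in a few lines. This is a minimal illustration, not the method anyone in the thread used: it assumes the MAC is hashed as a lowercase, colon-separated ASCII string with plain unsalted SHA-256, and `format_mac`, `hash_mac` and `recover_mac` are hypothetical helper names. The `suffix_bits` parameter exists only so the search can be tried on a smaller space.

```python
import hashlib
from typing import Optional


def format_mac(value: int) -> str:
    """Format a 48-bit integer as a lowercase colon-separated MAC string."""
    return ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


def hash_mac(mac: str) -> str:
    """Unsalted SHA-256 of the MAC string (assumed encoding, see lead-in)."""
    return hashlib.sha256(mac.encode("ascii")).hexdigest()


def recover_mac(target_hash: str, oui: int, suffix_bits: int = 24) -> Optional[str]:
    """Try every device suffix under one 24-bit vendor prefix (OUI).

    Returns the matching MAC string, or None if no candidate in the
    searched range hashes to target_hash.
    """
    base = oui << 24
    for suffix in range(1 << suffix_bits):
        candidate = format_mac(base | suffix)
        if hash_mac(candidate) == target_hash:
            return candidate
    return None
```

Pure Python is far slower than the GPU figure quoted above, but the search space per vendor is the same 2²⁴ candidates, so the hash provides no real protection once the vendor prefix is known or enumerated.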

      • Buffalox@lemmy.world
        17 days ago

        > this would have is some sort of tracking.

        It’s right at the top of the announcement, that it’s mainly for more accurate stats on unique users.
        It’s not that I think this is a good idea, because I don’t, but some people are blowing it out of proportion, especially since this isn’t decided at all, and I seriously doubt it will be.

        • gnuhaut@lemmy.ml
          17 days ago

          You don’t need this to count unique users. You could just assign a random number on install or whatever. Or even more simply, just run the thing once per month, should be accurate enough. Do they expect the software to just randomly spam duplicate reports? Don’t write it that way.

          Best case they don’t care about collecting minimal data and don’t understand that hashed MACs are easily reversible. So incompetent fools with no sensitivity to privacy.

          Maybe this should be Manjaro’s tagline: Not purposely malicious, just grossly negligent and ignorant.

          • Buffalox@lemmy.world
            17 days ago

            > You could just assign a random number on install or whatever.

            Funny, I thought the exact same thing.
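
The "random number on install" idea both commenters land on could look roughly like this. It is a minimal sketch under stated assumptions: the storage path is chosen by the caller, and `get_install_id` is a hypothetical helper, not part of MDD.

```python
import uuid
from pathlib import Path


def get_install_id(path: Path) -> str:
    """Return a persistent random identifier, creating it on first use.

    The value is a random UUID with no relation to any hardware,
    so it cannot be reversed into a MAC address or serial number.
    """
    if path.exists():
        return path.read_text().strip()
    install_id = uuid.uuid4().hex
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(install_id + "\n")
    return install_id
```

Because the identifier is generated once and then reused, it is enough to deduplicate reports per installation, and deleting the file resets it.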