• JasonDJ@lemmy.zip · 4 days ago

      That’s the idea. It’s pretty worthless for home use, but for AI workloads it might make sense; the problem is that it’s not quite scalable yet.

      Essentially, if you’ve got 256 Tb/s going over 200 km of fiber, that means there are quite literally 32,000,000,000 bytes (32 GB) “in flight”, living on the fiber at any given moment.
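
      The arithmetic checks out. A quick back-of-the-envelope in Python (assuming the usual ~200,000 km/s figure for light in silica fiber):

```python
# Bits "in flight" on a fiber = bandwidth x propagation delay.
bandwidth_bps = 256e12          # 256 Tb/s
fiber_length_km = 200
v_glass_km_s = 200_000          # ~2/3 c, typical for silica fiber

delay_s = fiber_length_km / v_glass_km_s        # time one bit spends on the fiber
bytes_in_flight = bandwidth_bps * delay_s / 8   # bits -> bytes

print(f"{delay_s * 1e3:.1f} ms of fiber, {bytes_in_flight / 1e9:.0f} GB in flight")
# -> 1.0 ms of fiber, 32 GB in flight
```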

      So it’s essentially a revolving sushi belt of bytes, roughly as large as London (inside the M25), moving at nearly the speed of light.

      Of course, it doesn’t have to be the size of London. You could wind it into something about the size of a softball. Theoretically.

      It’s a cool idea, and Carmack is no doubt a brilliant man. It seems far-fetched, but it’s kind of been done before: https://en.wikipedia.org/wiki/Core_rope_memory

      • Morphit@feddit.uk · 4 days ago

        It’s an optical delay-line memory. Early computer memories were acoustic in some manner.

        I can’t imagine that the latency of “delay line RAM” would be acceptable to anyone today. Maybe there’s some clever multiplexing that could improve that, but it would surely add more complexity than just making more RAM ICs.

        • tal@lemmy.todayOP · 4 days ago

          Neural net computation has predictable access patterns, so instead of treating the thing as random-access memory, with latency incurred while waiting for the bit you want to come around, I expect you can lay the data out so that the appropriate bit is always arriving just as you need it. I’d guess it also needs the ability to buffer a small amount of data to keep multiple fiber coils in sync despite thermal expansion.
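
          That scheduling trick can be sketched as a toy (all names here are made up for illustration): model the loop as a circular buffer that advances one slot per tick, and write the data in exactly the order the computation will consume it, so every read is a zero-wait hit.

```python
from collections import deque

class DelayLineLoop:
    """Toy delay-line memory: only the slot under the read head is
    visible at any instant; the whole loop rotates one slot per tick."""
    def __init__(self, data):
        self.loop = deque(data)

    def tick(self):
        self.loop.rotate(-1)  # circulate: front value re-enters at the back

    @property
    def visible(self):
        return self.loop[0]

# Lay the data out in the exact order the computation will ask for it...
access_order = ["w3", "w0", "w2", "w1"]
line = DelayLineLoop(access_order)

# ...and every read hits the moment its value comes around: no stalls.
served = []
for _ in access_order:
    served.append(line.visible)
    line.tick()

print(served)  # -> ['w3', 'w0', 'w2', 'w1']
```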

          The Hacker’s Jargon File has an anecdote about doing something akin to that with drum memory, “The Story of Mel”.

          http://www.catb.org/~esr/jargon/html/story-of-mel.html

  • ms.lane@lemmy.world · 4 days ago

    It’s an interesting idea, but what’s the floor size for a pair of 200TB/s fibre transceivers vs. 32GB of HBM?

    If it’s not significantly less, this doesn’t seem like it’d be particularly helpful outside of the 200 TB/s streaming use case.

  • brucethemoose@lemmy.world · 4 days ago

    The issue with AI is “now”

    Can they power them with solar? Nuclear? Hell, even a natural gas plant? Nope, the data centers need the power right this second, so they get gas turbines on site. Same with cooling; evaporative is just the quickest and cheapest to set up.

    Same with its architecture. There’s no time to fix temperature/sampling issues, no time to try bitnet or any of a bazillion interesting papers that came out. A shippable product (model) is needed yesterday; just scale up what we have. “Fail” a single experiment? Your team is fired, which is exactly what happened at Meta.

    Everything has to happen right now because of corporate FOMO. So, while this is an interesting musing and maybe Intel or someone will play with it, the actual AI labs could not care less because they can’t get it immediately.

  • tal@lemmy.todayOP · 4 days ago

    Note that this is from last month, though I haven’t seen it submitted.

      • nova_ad_vitum@lemmy.ca · 4 days ago

        The lack of investment in more production capacity for RAM is based on a roughly 3-year horizon for this insane extra AI demand.

        Creating workable consumer-grade alternatives with delay line memory of all things would take longer than that, and the market would collapse the moment AI demand for RAM dried up. This is one of those things that is theoretically possible but due to both technology and market conditions will absolutely not be a thing.

        • tal@lemmy.todayOP · 4 days ago

          Creating workable consumer-grade alternatives

          I think that this is intended not to replace DIMMs in PCs, but to replace HBM for AI use. If you’re doing neural net computation, you have very predictable access patterns, so you can store your edge weights such that the desired data is showing up at just the right time.

  • eleitl@lemmy.zip · 4 days ago

    They never mention the word latency even once. This is a delay-line SAM (sequential-access memory), and the speed of light in glass is some 200,000 km/s, so a 200 km loop takes about a millisecond to come around. That’s hard drive latency.
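
    The numbers back that up. For a random access you have to wait for the wanted bit to come around (a sketch, using the same ~200,000 km/s assumption):

```python
loop_km = 200
v_glass_km_s = 200_000                 # ~speed of light in silica fiber

worst_case_s = loop_km / v_glass_km_s  # wanted bit just went past: wait a full loop
avg_case_s = worst_case_s / 2          # on average, wait half a loop

print(f"worst {worst_case_s * 1e3:.1f} ms, average {avg_case_s * 1e3:.2f} ms")
# -> worst 1.0 ms, average 0.50 ms
# DRAM access is ~100 ns; a 7200 rpm disk averages ~4.2 ms rotational latency,
# so sub-millisecond random access really is spinning-disk territory.
```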

  • geekwithsoul@piefed.social · 4 days ago

    I don’t pretend to understand how this would actually work, but wouldn’t this essentially be like token ring networking but used as memory?

    • tal@lemmy.todayOP · 4 days ago

      A little bit, but normally Token Ring didn’t just keep data running around in a circle indefinitely; Token Ring works more like a roundabout, where data enters the ring at one computer and exits at another device. Without looking, I suspect Token Ring had some mechanism, akin to the TTL (time-to-live) field in Internet Protocol packets, to keep a mis-addressed frame from circling forever.

      Also, I’m assuming that an implementation of Carmack’s idea would have only one…I don’t know the right term, might be “repeater”. You need some device to receive the data and retransmit it, to keep the signal strong and keep it from spreading out. You wouldn’t want a ton of those, because they’d add cost. On Token Ring, you’d have a bunch of transceivers, to have a bunch of “exits”, since the whole point is to move data from one device to another.