Also speculating here, but I don't know of many folks training their own models on PB scale datasets either. That sort of training is ludicrously expensive and, outside of the startups getting fun money thrown in their direction, most folks are using off-the-shelf models for most of their heavy lifting.
The trend seems to be toward cloud-based, pre-trained foundation models which businesses then fine-tune on local datasets. Because of that fine-tuning step, SMB/enterprise demand for storage is likely to grow faster than it otherwise would. My personal take is that individual PC users only ever account for a small share of hardware demand.
Oh, I was only talking about enterprise demand. I agree, consumers have always gotten, and will always get, the dregs of enterprise. Training an AI at home (at this point) is like running a Kubernetes cluster at home. Sure, it mimics the patterns of a production cluster, but the scale is effectively a rounding error.
My point is that, even in the enterprise context, few companies are training their own models. And even when they are, they're rarely training from the ground up; more often than not, they're extending a base model. Storage demand, and therefore prices, will continue to increase because we're spinning off more data as a society in a day than we can ever hope to capture, not because we're all training OpenAI-scale models.
Maybe you can argue that there's increased interest in capturing the more "mundane" data for training purposes, but so much of that data is noise that filtering the raw bits down into a useful signal will always bring us back to the real bottleneck: compute.
The only evidence I can really point to is that researchers were working on transformers for _years_ before this wave. We've only just hit a point where it's computationally feasible to scale the architecture and train the model.
I also acknowledge that, judging by your post history, we're likely on opposite sides of the bullish-bearish spectrum.