If I made any mistake in my assumptions or method please correct me.
I was going to leave this to someone else, but I might as well put my thoughts in before some mistaken conclusions are made.
Now for debouncing. I had always assumed that debouncing only affects the actuation of a single key. Only that would make sense because debounce serves solely to prevent the creation of multiple messages from a single switch being actuated multiple times during a very short time frame (caused by the mechanical nature of switches). There is no reason to debounce the whole matrix.
With showing that the signaling rate does not cap out at <200Hz, I feel validated that debouncing is not a global phenomenon. If keystrokes can be registered only 1 millisecond apart from each other, the minimum of 5ms debounce can not apply to two different keys.
Debouncing still occurs even if multiple keys are pressed. How do I know that? Well, based on your test, if debouncing only occurred for one key and you pressed two, the following would occur. For the first (debounced) key that you pressed, you'd get one output. For the second (non-debounced) key, you'd get a multitude of presses over the course of about 5 ms. In your program, you'd see a bunch of spurious presses from the second key you hit, followed by a single press from the first key you actually hit. The time interval would be roughly 3-12 ms until you stopped receiving output.
Since you get only one "press" for each key you hit, then you know that they are both debounced.
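To make that argument concrete, here is a rough sketch in Python. It is not any real firmware's algorithm: the function name `count_presses`, the two bounce traces, and the "wait until the input has been stable for N scans" rule are all assumptions made up for illustration, with one sample per 1 ms scan.

```python
def count_presses(samples, debounce_samples):
    """Count reported presses in a 0/1 trace (one sample per matrix scan).

    A state change is reported only after the raw input has held the new
    value for `debounce_samples` consecutive scans. With
    debounce_samples=1 every raw rising edge is reported (no debounce).
    """
    stable = 0      # last reported (debounced) state
    candidate = 0   # raw value currently being timed
    run = 0         # consecutive scans on which `candidate` was seen
    presses = 0
    for raw in samples:
        if raw == candidate:
            run += 1
        else:
            candidate = raw
            run = 1
        if candidate != stable and run >= debounce_samples:
            stable = candidate
            if stable == 1:
                presses += 1
    return presses


# Hypothetical bounce traces for two keys hit almost simultaneously.
key_a = [0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1]
key_b = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]

# Every key debounced (5 scans): one press per key.
both_debounced = (count_presses(key_a, 5), count_presses(key_b, 5))

# Only the first key debounced: the second key reports every bounce.
only_a_debounced = (count_presses(key_a, 5), count_presses(key_b, 1))

print(both_debounced)    # (1, 1)
print(only_a_debounced)  # (1, 3)
```

Seeing exactly one press per key in your test matches the first case, not the second.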
What does this have to do with latency? Unfortunately, it's sort of hard to measure latency. Here is what actually happens:
You press two keys more or less simultaneously. The microcontroller starts to see some action at those nodes on the matrix and runs some debounce. About 10 ms later, the microcontroller sends both outputs at more or less the same time.
The poll rate does matter, because it gives a finer grain on when the keyboard can output useful info to the computer. I would say you don't gain too much from this, as anything the computer sees from the KB controller will be at least 10 ms later than when you performed the action. The time range is something like: lower bound (debounce time) + (minimum polling latency, theoretically 0), upper bound (debounce time) + (maximum polling latency). Having a better polling rate will improve your 'polling latency' measure (as I call it) and will lower the upper bound a little.
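As a quick back-of-the-envelope check of those bounds, here is a small Python sketch. The function name `latency_bounds_ms` is made up for this example, the 10 ms debounce figure is just the rough number used above, and 125 Hz is only picked as a slower poll rate to compare against 1000 Hz. It ignores matrix scan time and anything the host does after polling.

```python
def latency_bounds_ms(debounce_ms, poll_hz):
    """Return (lower, upper) latency bounds in milliseconds.

    lower = debounce time + minimum polling latency (taken as 0)
    upper = debounce time + maximum polling latency (one poll interval)
    """
    poll_interval_ms = 1000.0 / poll_hz
    return debounce_ms, debounce_ms + poll_interval_ms


for poll_hz in (125, 1000):
    low, high = latency_bounds_ms(10.0, poll_hz)
    print(f"{poll_hz} Hz poll: {low} ms to {high} ms")
# 125 Hz poll: 10.0 ms to 18.0 ms
# 1000 Hz poll: 10.0 ms to 11.0 ms
```

Under these assumptions the faster poll rate only trims the upper bound; the debounce time sets the floor either way.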
I don't know too much about the USB spec, so I don't know how long it takes for the computer to "do something" about a USB input once it's polled the USB controller and registered an event, but that would add to the "polling latency" measure.
You also run into issues with how fast the KB scans the matrix. Due to debounce times being so long, little is gained from scanning the matrix faster than 200 Hz or so.
If you had a keyboard that didn't need debouncing, you could dramatically speed up the keyboard to near realtime. The proposed plan for Hall effect will scan the matrix at (hopefully) about 100,000 Hz (roughly three orders of magnitude faster than most Cherry KBs), and since the switches are Schmitt-triggered, they don't need debouncing (and instead have a little hysteresis). In such a case, the USB poll rate is the limiting point, as it's "only" 1000 Hz.

In such a KB, you'd press a key (past the actuation point), the microcontroller would register it at most 1/100,000 of a second (10 µs) later, and the computer would receive the event at most 1/1000 of a second (1 ms) after that. It would be quite hard to get better performance than this.
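The arithmetic for that worst case, as a tiny Python sketch (the function name `worst_case_latency_us` is just for illustration, and it assumes the only delays are one full scan interval plus one full USB poll interval):

```python
def worst_case_latency_us(scan_hz, poll_hz):
    """Worst-case delay in microseconds with no debounce:
    one full matrix scan interval plus one full USB poll interval."""
    scan_interval_us = 1_000_000 / scan_hz
    poll_interval_us = 1_000_000 / poll_hz
    return scan_interval_us + poll_interval_us


print(worst_case_latency_us(100_000, 1000))  # 1010.0
```

So about 1.01 ms in total, of which the scan contributes only 10 µs; the USB poll interval dominates.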
So I think you are still seeing debounce and therefore latency. I think debounce times for Cherry MX switches are somewhere between 5 ms and 12 ms, but I would have to look up that value to be sure.
Another thing:
Only that would make sense because debounce serves solely to prevent the creation of multiple messages from a single switch being actuated multiple times during a very short time frame (caused by the mechanical nature of switches). There is no reason to debounce the whole matrix.
This is not the case, which explains your mistaken conclusion. When you actuate a contact-based switch (and almost any type of switch there is), the switch will "bounce" between the open and closed states before settling in the closed position. Here is a picture from Wikipedia:

So it serves to make a single keypress register as a single actuation, not to distinguish multiple keypresses in rapid succession. You need this for every switch to avoid the behaviour I explained at the beginning of my reply. The most common indicator of an improperly debounced switch is that you'll get multiple actuations on a single keypress. To prevent this, you must debounce every switch. This is indeed the mechanical nature of switches and occurs for almost every switch.
I recommend Haata's presentation from Keycon, starting at t=5126 (his presentation ends at roughly 1:53:54).
You can also read his slides here.