Our hypothesis is that probing methods, when done right, hold significant potential. Drawing inspiration from binary code analysis, where dynamic approaches are more common than static ones, we believe that running neural networks on inputs, i.e., probing, is a promising direction for weight space learning. We begin with two preliminary experiments to assess the quality and potential of probing approaches:
- Comparing a vanilla probing baseline to previous graph-based and mechanistic approaches. With enough probes, (a) vanilla probing outperforms graph approaches that do not use probing, and (b) graph approaches match probing only when they also use probing features.
- Comparing learned probes to probes drawn from randomly selected data. We show that such synthetic probes are as effective as latent-optimized ones.
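To make the vanilla probing setup concrete, here is a minimal sketch, assuming small NumPy MLPs as stand-ins for the networks being analyzed: a fixed set of probe inputs is fed through each network, and the concatenated outputs form a feature vector for that network. All function names, dimensions, and the random-MLP setup are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(in_dim=4, hidden=8, out_dim=2):
    """Sample a small random MLP (illustrative stand-in for a zoo model)."""
    return {
        "W1": rng.normal(size=(in_dim, hidden)),
        "b1": rng.normal(size=hidden),
        "W2": rng.normal(size=(hidden, out_dim)),
        "b2": rng.normal(size=out_dim),
    }

def forward(net, x):
    # Simple two-layer forward pass with tanh nonlinearity.
    h = np.tanh(x @ net["W1"] + net["b1"])
    return h @ net["W2"] + net["b2"]

def probe_features(net, probes):
    """Run the network on fixed probe inputs; the flattened outputs
    serve as a feature vector representing the network."""
    return forward(net, probes).ravel()

# A fixed set of probe inputs shared across all networks; drawing them
# at random mirrors the 'randomly selected data' setting above.
probes = rng.normal(size=(16, 4))

zoo = [make_mlp() for _ in range(5)]
features = np.stack([probe_features(net, probes) for net in zoo])
print(features.shape)  # one feature vector per network: (5, 32)
```

A downstream classifier (e.g., predicting a property of each network) would then be trained on these feature vectors; adding more probe inputs enlarges the feature vector, which is the "with enough probes" knob referenced above.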