OPU linear_transform always outputs 0 in some coordinates

I am using the OPU on Aurora E and I am encountering a strange behavior: some columns of the OPU implicit matrix seem to be equal to zero.

I can’t reproduce the error in a simpler example, but it happens each time I run a given script on Aurora E: the OPU outputs 0 on some coordinates, whatever input I provide.

I lack the words in English, but this code sample should show clearly what I mean. I obtained it using the Python debugger pdb:

>>> output = self.opu.linear_transform(np.random.choice([0,1], size=(3, 10)))
>>> np.where(np.linalg.norm(output, axis=0) == 0)
(array([ 947, 1248, 1491, 1492, 2926, 3923]),)
>>> output = self.opu.linear_transform(np.random.choice([0,1], size=(20, 10)))
>>> np.where(np.linalg.norm(output, axis=0) == 0)
(array([ 947, 1491, 3923, 4118]),)
>>> output = self.opu.linear_transform(np.random.choice([0,1], size=(1000, 10)))
>>> np.where(np.linalg.norm(output, axis=0) == 0)
(array([ 947, 3923]),)
>>> output = self.opu.linear_transform(np.random.choice([0,1], size=(10000, 10)))
>>> np.where(np.linalg.norm(output, axis=0) == 0)
(array([ 947, 3923]),)

We see that some of the column norms of the output batch are equal to zero, meaning that every sample in the batch has 0 as its output value in these columns, i.e. the implicit matrix underlying the OPU linear transformation seems to have columns full of zeros as well.
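
To make the reasoning concrete, here is a minimal sketch that uses a simulated matrix instead of the OPU. The names `M`, `dead` and `zero_columns` are made up for this illustration, and I am assuming the transform behaves like a plain matrix product `x @ M`. If a column of the matrix is zero, the matching output column has zero norm whatever the input batch is:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the OPU's implicit matrix: 10 inputs, 8 outputs.
M = rng.normal(size=(10, 8))
dead = [2, 5]
M[:, dead] = 0.0  # force two columns of the matrix to zero

def zero_columns(n_samples):
    """Return the indices of output columns whose norm is exactly zero."""
    x = rng.integers(0, 2, size=(n_samples, 10))  # random binary inputs
    out = x @ M                                   # simulated linear transform
    return np.where(np.linalg.norm(out, axis=0) == 0)[0]

small = zero_columns(3)
large = zero_columns(1000)
print(small, large)
```

In this simulation the zeroed columns show up for every batch size, which matches what I see for indices 947 and 3923 in the session above.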

Is this expected behavior?

Also, what would you advise me to do, given that I always encounter this behavior when I run my script (with these exact indices giving value 0: [947, 1491, 3923]), even when I use a different seed? I think I could work around the problem by simply ignoring the problematic columns, but I would like your opinion on this.
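
The workaround I have in mind would look roughly like the sketch below. It runs on simulated arrays rather than real OPU output, and `find_dead_columns` and `drop_columns` are hypothetical helpers I would write myself, not part of any library. The idea is to keep only the indices that are zero in every batch, then mask them out:

```python
import numpy as np

def find_dead_columns(outputs):
    """Indices of columns whose norm is zero in every batch of `outputs`."""
    dead = None
    for out in outputs:
        idx = np.where(np.linalg.norm(out, axis=0) == 0)[0]
        dead = idx if dead is None else np.intersect1d(dead, idx)
    return np.array([], dtype=int) if dead is None else dead

def drop_columns(output, dead):
    """Return `output` without the columns listed in `dead`."""
    keep = np.ones(output.shape[1], dtype=bool)
    keep[dead] = False
    return output[:, keep]

# Simulated batches standing in for OPU outputs (strictly positive values).
rng = np.random.default_rng(0)
a = rng.integers(1, 255, size=(3, 6)).astype(float)
b = rng.integers(1, 255, size=(20, 6)).astype(float)
a[:, [1, 4]] = 0  # column 1 is zero only in the small batch
b[:, [4]] = 0     # column 4 is zero in both batches

dead = find_dead_columns([a, b])
cleaned = drop_columns(b, dead)
print(dead, cleaned.shape)
```

Taking the intersection over several batches avoids discarding a column that happened to be zero only in one small batch.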

Thank you

Hi Luc,

Thanks for reporting this. I will investigate it and get back to you.