How to retain only the first occurrence of strictly increasing values in each row of a NumPy array?

I'm working on computing first-passage probabilities and need a vectorized solution using NumPy (without explicit loops) to keep only the first occurrence of a strictly increasing number in each row while setting all other elements to 0.

For example, given the array:

PYTHON
arr = np.array([
    [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5],
    [1, 1, 2, 2, 2, 3, 2, 2, 3, 3, 3, 4, 4],
    [3, 2, 1, 2, 1, 1, 2, 3, 4, 5, 4, 3, 2]
])

The desired output is:

PYTHON
out = np.array([
    [1, 0, 0, 2, 0, 0, 3, 0, 0, 4, 0, 5, 0],
    [1, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 4, 0],
    [3, 0, 0, 0, 0, 0, 0, 0, 4, 5, 0, 0, 0]
])

I would appreciate a NumPythonic approach that leverages vectorized operations (e.g., using cumulative maximum) to solve this problem. Any help is welcome!

3 Answers

Answer by StackOverflow TychoAI Agent • 5 months ago

Below is one vectorized solution using NumPy's cumulative maximum. The idea is to compute the running maximum along each row and keep an entry only at the position where that maximum first increases (the very first element of each row is always kept):

PYTHON
import numpy as np

arr = np.array([
    [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5],
    [1, 1, 2, 2, 2, 3, 2, 2, 3, 3, 3, 4, 4],
    [3, 2, 1, 2, 1, 1, 2, 3, 4, 5, 4, 3, 2]
])

# Compute the cumulative maximum along rows.
cum_max = np.maximum.accumulate(arr, axis=1)

# Create a boolean mask:
# First element in every row is always True.
# For the rest, True only if the cummax is greater than the previous cummax.
mask = np.zeros_like(arr, dtype=bool)
mask[:, 0] = True
mask[:, 1:] = cum_max[:, 1:] > cum_max[:, :-1]

# Build the output array.
out = np.where(mask, arr, 0)
print(out)

Explanation:

np.maximum.accumulate(arr, axis=1) computes the running maximum along each row. It increases only when a new, strictly higher element is encountered.
The mask sets the first element of each row to True. For the rest, an entry is marked True only if the cumulative maximum increased from the previous column; that is, the value is the first occurrence of a new maximum.
Finally, np.where(mask, arr, 0) keeps only the values corresponding to the first occurrence of an increase and replaces all other elements with zero.
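
To make the three steps above concrete, here is a minimal trace on a single row (the second row of the example). The names running_max, is_new_max and kept are illustrative only and do not appear in the answer's code:

```python
import numpy as np

# Second row of the example array from the question.
row = np.array([1, 1, 2, 2, 2, 3, 2, 2, 3, 3, 3, 4, 4])

# Step 1: running maximum; it stays flat while the row dips back down.
running_max = np.maximum.accumulate(row)

# Step 2: True where the running maximum just increased (first entry always True).
is_new_max = np.ones(row.shape, dtype=bool)
is_new_max[1:] = running_max[1:] > running_max[:-1]

# Step 3: keep the value at those positions, zero elsewhere.
kept = np.where(is_new_max, row, 0)

print(running_max)  # [1 1 2 2 2 3 3 3 3 3 3 4 4]
print(kept)         # [1 0 2 0 0 3 0 0 0 0 0 4 0]
```

Note that the second 3 in the row (after the dip to 2) is dropped, because the running maximum was already 3 at that point.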

When you run the above code, you get the desired output:

PYTHON
[[1 0 0 2 0 0 3 0 0 4 0 5 0]
 [1 0 2 0 0 3 0 0 0 0 0 4 0]
 [3 0 0 0 0 0 0 0 4 5 0 0 0]]

Answer by EtherealNavigator542 • 6 months ago

Here's one approach:

PYTHON
m = np.hstack(
    (np.ones((arr.shape[0], 1), dtype=bool),
     np.diff(np.fmax.accumulate(arr, axis=1)) >= 1)
    )

out = np.zeros_like(arr)

out[m] = arr[m]

Output:

PYTHON
array([[1, 0, 0, 2, 0, 0, 3, 0, 0, 4, 0, 5, 0],
       [1, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 4, 0],
       [3, 0, 0, 0, 0, 0, 0, 0, 4, 5, 0, 0, 0]])

Explanation

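Use np.fmax with np.ufunc.accumulate to get the running maximum for each row (np.fmax behaves like np.maximum but ignores NaNs where it can).
Now, check where np.diff of that running maximum is greater than or equal to 1, i.e. where the maximum increased. For integer data this is the same as testing for a difference greater than 0.
Use np.hstack to prepend a column of True values for the first column (via np.ones).
Finally, initialize an array of zeros with the same shape as arr (via np.zeros_like) and copy over the values selected by the mask.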
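
One caveat worth sketching: a threshold of 1 on the difference only means "strictly greater" when the data are integers. The example below is an assumption-laden illustration with made-up float values (not from the question), comparing a threshold of 1 with a test for any positive difference:

```python
import numpy as np

# Hypothetical float data: the running maximum rises by less than 1 at index 1.
values = np.array([[0.5, 0.7, 0.7, 1.9]])

running_max = np.fmax.accumulate(values, axis=1)
step = np.diff(running_max, axis=1)

# Column of True so the first element of each row is always kept.
first_col = np.ones((values.shape[0], 1), dtype=bool)

mask_ge_one = np.hstack((first_col, step >= 1))   # threshold of 1
mask_gt_zero = np.hstack((first_col, step > 0))   # any strict increase

out_ge_one = np.where(mask_ge_one, values, 0)     # keeps 0.5 and 1.9 only
out_gt_zero = np.where(mask_gt_zero, values, 0)   # keeps 0.5, 0.7 and 1.9
```

For the integer array in the question both masks are identical, so this only matters if the same code is reused on non-integer data.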

Answer by SaturnianDiscoverer512 • 6 months ago

The maximum can be accumulated per row:

PYTHON
>>> arr
array([[1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5],
       [1, 1, 2, 2, 2, 3, 2, 2, 3, 3, 3, 4, 4],
       [3, 2, 1, 2, 1, 1, 2, 3, 4, 5, 4, 3, 2]])
>>> np.maximum.accumulate(arr, axis=1)
array([[1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5],
       [1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4],
       [3, 3, 3, 3, 3, 3, 3, 3, 4, 5, 5, 5, 5]])

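Then you can easily mask out non-increasing values. np.where treats any nonzero difference as True, and prepend=0 makes the first column count as a change unless it is 0 itself, in which case the output is 0 either way: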

PYTHON
>>> m_arr = np.maximum.accumulate(arr, axis=1)
>>> np.where(np.diff(m_arr, axis=1, prepend=0), arr, 0)
array([[1, 0, 0, 2, 0, 0, 3, 0, 0, 4, 0, 5, 0],
       [1, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 4, 0],
       [3, 0, 0, 0, 0, 0, 0, 0, 4, 5, 0, 0, 0]])
