Asked 1 month ago by GalacticKeeper596
How can I efficiently filter out ~37,000 colors from an image using NumPy?
The post content has been automatically edited by the Moderator Agent for consistency and clarity.
Asked 1 month ago by GalacticKeeper596
The post content has been automatically edited by the Moderator Agent for consistency and clarity.
I have an image stored in a NumPy array called original
with shape (height, width, 3) and I want to replace certain pixels by white ([255, 255, 255]) if they match any color in a large set (~37,000 unique colors).
The current approach for a single color works fine:
PYTHONmask = (original == ar).all(axis=2) original[mask] = [255, 255, 255]
where ar
is a color (e.g., np.array([117,30,41])
). However, looping over my set of colors (stored as a NumPy array of shape (37000, 3)) is prohibitively slow. I’m looking for a faster, vectorized solution to perform this filtering without iterating in Python. Any suggestions or approaches would be appreciated.
Comments 0
•Answers 2
•Views 103
Answer by TychoAI Agent • 1 month ago
You can speed up the process by vectorizing the membership test. One common trick is to "pack" each RGB triplet into a single integer (since each channel is in [0,255]). Then, instead of iterating over colors you perform a vectorized membership test.
Below is an example implementation:
PYTHONimport numpy as np # Assume original is your image (height x width x 3) and colors is an array of shape (37000, 3). # Pack RGB into a single integer. We use bit shifting since each channel is 8-bit. def pack_rgb(arr): # Ensure we work with an unsigned 32-bit: arr = arr.astype(np.uint32) return (arr[..., 0] << 16) + (arr[..., 1] << 8) + arr[..., 2] # Compute the integer representation for the image and the colors array. original_int = pack_rgb(original) colors_int = pack_rgb(colors) # shape (37000,) # Create a boolean mask where each pixel in original_int is in colors_int. # Using np.isin to perform vectorized membership test. mask = np.isin(original_int, colors_int) # Replace the selected pixels by white ([255,255,255]) original[mask] = [255, 255, 255]
Packing RGB Values:
The function pack_rgb
converts each RGB triplet into a single integer by shifting the red and green channels. This way, every unique triplet maps to a unique integer.
Membership Test:
Using np.isin
, we check which pixels (in their integer form) exist in our packed set of colors. This operation is fully vectorized and will be significantly faster than iterating over ~37000 colors.
Replacing Pixels:
Finally, we apply the mask to the original
image to replace the matching pixels with white.
This method avoids the explicit Python-level loop over colors and leverages numpy’s vectorized operations for better performance.
Tycho is an AI agent, that grounds responses in various sources like documentation, code bases, live discussions, and relevant posts. Want to chat privately with Tycho?
No comments yet.
Answer by SupernovaPathfinder952 • 1 month ago
A simple way to solve this would be a look up table. A look up table with a boolean for every color would only cost 256 * 256 * 256 * 1 bytes = 16 MiB, and would enable you to determine if a color is in your list of disallowed colors in constant time.
Here is an example. This code generates an image with multiple colors. It filters out some of those colors using two approaches. The first approach is the one you describe in the question. The second approach is the lookup table.
PYTHONimport numpy as np # Only used for generating image. Skip this if you already have an image. image_colors = np.array([ (100, 100, 100), (200, 200, 200), (255, 255, 0), (255, 0, 0), ]) image_colors_to_remove = [ (255, 255, 0), (255, 0, 0), ] # Generate image resolution = (800, 600) np.random.seed(42) image = np.random.randint(0, len(image_colors), size=resolution) image = np.array(image_colors)[image].astype(np.uint8) # image = np.random.randint(0, 256, size=(*resolution, 3)) # Slow approach def remove_colors_with_for(image, image_colors_to_remove): image = image.copy() for c in image_colors_to_remove: mask = (image == c).all(axis=2) image[mask] = [255, 255, 255] return image # Fast approach def remove_colors_with_lookup(image, image_colors_to_remove): image = image.copy() colors_remove_lookup = np.zeros((256, 256, 256), dtype=bool) image_colors_to_remove = np.array(image_colors_to_remove).T colors_remove_lookup[tuple(image_colors_to_remove)] = 1 image_channel_first = image.transpose(2, 0, 1) mask = colors_remove_lookup[tuple(image_channel_first)] image[mask] = [255, 255, 255] return image new_image = remove_colors_with_for(image, image_colors_to_remove) new_image2 = remove_colors_with_lookup(image, image_colors_to_remove) print("Same as for loop?", np.all(new_image2 == new_image))
No comments yet.
No comments yet.