In Python, sets are a powerful data structure used to store unique items in an unordered collection. If you're familiar with mathematical sets, you'll find Python sets quite similar.

They are ideal for operations involving comparisons and membership tests, such as finding the differences between two collections or checking whether an item exists within a group.

What Is a Set?

A set in Python is a collection of distinct (non-duplicate) items that are unordered. Unlike lists, which can contain duplicate elements and maintain order, sets automatically remove any duplicate entries and do not preserve the order of elements. This makes sets ideal for situations where the uniqueness of items is a priority and the order is not important.

Syntax

To create a set, you can place multiple items within curly braces, like so:

# Creating a set with initial items
my_set = {'apple', 'banana', 'cherry', 'apple'}

print(my_set)  # Example output (order may vary): {'banana', 'cherry', 'apple'}

Notice that 'apple' is listed twice in the example above, but when Python creates the set, it automatically discards the duplicate, so print() shows only the unique items. Because sets are unordered, the items may appear in a different order when you run the code yourself.
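
Curly braces are not the only way to create a set. As a minimal sketch (the variable names here are only illustrative), the built-in set() function can build a set from any iterable, and it is also the way to create an empty set, since empty curly braces produce a dictionary instead:

```python
# Build a set from a list; the duplicate 'apple' is dropped
fruits = set(['apple', 'banana', 'apple'])
print(len(fruits))  # 2

# Empty curly braces create an empty dict, not an empty set
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
```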

Key Operations on Sets

Sets support a variety of operations, including the mathematical set operations of union, intersection, and difference:

  • Adding items: You can add items to a set using the add() method.
# Example of a set
my_set = {'apple', 'banana', 'cherry'}

# Adding an item (add() changes the set in place and returns None)
my_set.add('orange')
print(my_set)  # Example output (order may vary): {'banana', 'cherry', 'apple', 'orange'}
  • Unions and intersections: A union combines all items from both sets, keeping one copy of each, while an intersection keeps only the items that are common to both sets.
# Unions and intersections
set1 = {'apple', 'banana'}
set2 = {'banana', 'cherry'}

# Union of two sets
union_set = set1.union(set2)
print(union_set)  # Example output (order may vary): {'apple', 'banana', 'cherry'}

# Intersection of two sets
intersection_set = set1.intersection(set2)
print(intersection_set)  # Output: {'banana'}
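
Differences are mentioned above but not shown, so here is a short sketch using the same two sets. It uses the built-in difference() and symmetric_difference() methods, and also shows the operator shorthands (|, &, -, ^) that Python provides for these operations:

```python
set1 = {'apple', 'banana'}
set2 = {'banana', 'cherry'}

# Difference: items in set1 that are not in set2
only_in_set1 = set1.difference(set2)
print(only_in_set1)  # {'apple'}

# Symmetric difference: items in exactly one of the two sets
in_one_only = set1.symmetric_difference(set2)
print(sorted(in_one_only))  # ['apple', 'cherry']

# Operator shorthands give the same results as the methods
same_union = (set1 | set2) == set1.union(set2)
same_intersection = (set1 & set2) == set1.intersection(set2)
same_difference = (set1 - set2) == only_in_set1
same_symmetric = (set1 ^ set2) == in_one_only
print(same_union, same_intersection, same_difference, same_symmetric)  # True True True True
```

Note that difference is not symmetric: set2.difference(set1) would give {'cherry'} rather than {'apple'}.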

Why Use Sets?

Sets are particularly useful in scenarios where you need to handle unique items and perform operations like testing for membership, removing duplicates from a sequence, or calculating differences between two collections.

They are optimized for fast membership testing and are usually much faster than lists for this purpose.

"Membership testing" simply means checking whether a specific item is in a group, like a set or a list. Think of it as looking for a friend in a crowded room. Sets are really good at this kind of search because they are built on hash tables, which let Python jump almost directly to where an item would be stored instead of checking every item one by one. As a result, answering the question "Is this item in the set?" takes roughly the same time on average no matter how large the set is, while the same check on a list gets slower as the list grows. This is why we often prefer sets when we need to check frequently whether certain items are present.

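Membership testing uses the in operator, which works the same way on sets and lists. The sketch below uses made-up data (100,000 integers) and the standard timeit module to give a rough feel for the speed difference. The exact timings depend on your machine, so no particular numbers are claimed here:

```python
import timeit

# Hypothetical data: the same 100,000 numbers as a list and as a set
numbers_list = list(range(100_000))
numbers_set = set(numbers_list)
target = 99_999  # last item, so the list has to scan all the way to the end

# The in operator gives the same answer for both collections
print(target in numbers_list)  # True
print(target in numbers_set)   # True
print(-1 in numbers_set)       # False

# Time 100 lookups in each collection
list_time = timeit.timeit(lambda: target in numbers_list, number=100)
set_time = timeit.timeit(lambda: target in numbers_set, number=100)
print(f"list: {list_time:.6f}s, set: {set_time:.6f}s")
```
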
Consider a scenario where you need to find unique visitors to a website from a list of visitor IDs; a set is perfect for quickly assembling a unique collection without any duplicates.
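
The visitor scenario can be sketched in a few lines. The visitor IDs below are invented for illustration. Passing the list to set() removes the duplicates, and if the first-seen order matters, dict.fromkeys() is a common alternative, because a set does not preserve order:

```python
# Hypothetical visitor IDs, with repeat visits
visitor_ids = ['u1', 'u2', 'u1', 'u3', 'u2']

# A set keeps one copy of each ID
unique_visitors = set(visitor_ids)
print(len(unique_visitors))  # 3

# If order of first appearance matters, dict keys keep insertion order
ordered_unique = list(dict.fromkeys(visitor_ids))
print(ordered_unique)  # ['u1', 'u2', 'u3']
```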