Tensor Operations
Tensor Creation
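A few common creation calls, as a quick sketch:

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])   # from Python data
b = torch.zeros(2, 3)                # filled with 0s
c = torch.ones(2, 3)                 # filled with 1s
d = torch.rand(2, 3)                 # uniform samples in [0, 1)
e = torch.arange(0, 10, 2)           # 0, 2, 4, 6, 8
```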
Dimension Operations
Add a new dimension
unsqueeze()
t = torch.rand((3,4)) ## shaped [3, 4]
t_new = t.unsqueeze(0) ## shaped [1, 3, 4]
t_new = t.unsqueeze(1) ## shaped [3, 1, 4]
t_new = t.unsqueeze(2) ## shaped [3, 4, 1]
view()
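view() returns a tensor with the same underlying data but a new shape; it requires contiguous memory. A small sketch:

```python
import torch

t = torch.rand(3, 4)     # shaped [3, 4]
flat = t.view(12)        # shaped [12], shares storage with t
t2 = t.view(2, 6)        # shaped [2, 6]
t3 = t.view(-1, 2)       # -1 infers the size: shaped [6, 2]

# view() needs contiguous memory; t() (transpose) breaks contiguity,
# so reshape() (which copies when needed) must be used instead.
tt = t.t()               # shaped [4, 3], non-contiguous
r = tt.reshape(12)       # works; tt.view(12) would raise an error
```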
Activation Functions
Sigmoid
Computes the logistic sigmoid function of the elements of input.
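As a quick check, torch.sigmoid matches the formula sigmoid(x) = 1 / (1 + e^(-x)), applied elementwise:

```python
import torch

x = torch.tensor([-2.0, 0.0, 2.0])
y = torch.sigmoid(x)                   # logistic sigmoid, elementwise
manual = 1.0 / (1.0 + torch.exp(-x))   # the same formula written out

# sigmoid(0) == 0.5; all outputs lie in the open interval (0, 1)
```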
log_softmax
def log_softmax(input: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """
    Compute log_softmax.

    Args:
        input: Input tensor
        dim: Dimension along which to compute log_softmax
            (only -1 or the last dim is supported)

    Returns:
        Tensor with log_softmax applied along the specified dimension
    """
    if dim != -1 and dim != input.ndim - 1:
        raise ValueError(
            "This implementation only supports log_softmax along the last dimension"
        )
    # Flatten all dimensions except the last one
    original_shape = input.shape
    input_2d = input.reshape(-1, input.shape[-1]).contiguous()
    # Subtract the per-row max for numerical stability
    max_per_row = torch.max(input_2d, dim=-1, keepdim=True).values
    input_stable = input_2d - max_per_row
    # log_softmax(x) = (x - max) - log(sum(exp(x - max)))
    # Note: the max is already folded into input_stable, so it must not
    # be subtracted a second time here.
    exp_input = torch.exp(input_stable)
    sum_exp = torch.sum(exp_input, dim=-1, keepdim=True)
    log_sum_exp = torch.log(sum_exp)
    log_softmax_2d = input_stable - log_sum_exp
    return log_softmax_2d.reshape(original_shape)
The steps: compute the row-wise maximum and subtract it from all elements for numerical stability; then sum the exponentials of the shifted values; finally, subtract the log of that sum from the shifted input to obtain the log-softmax.
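To sanity-check the steps above, the stabilized manual computation can be compared against torch.nn.functional.log_softmax:

```python
import torch
import torch.nn.functional as F

x = torch.rand(5, 7) * 100  # large values would overflow a naive exp()

# Stabilized manual computation: shift by the row max, then exp/sum/log
m = x.max(dim=-1, keepdim=True).values
shifted = x - m
manual = shifted - shifted.exp().sum(dim=-1, keepdim=True).log()

reference = F.log_softmax(x, dim=-1)
# exp of a correct log_softmax sums to 1 along the last dimension
```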
CUDAGraph
Why static tensors rather than dynamic tensors?
When using CUDA Graphs, the operations and data involved in the graph must remain static to ensure correct execution. Dynamic tensors, which can change shape or size during runtime, would violate this requirement. CUDA Graphs capture a fixed sequence of operations and memory accesses, so any changes to tensor dimensions would lead to inconsistencies and potential errors during execution. Therefore, only static tensors with fixed shapes can be used within CUDA Graphs to maintain the integrity of the captured computation.
Why are buffer tensors needed?
Buffer tensors are used in CUDA Graphs to provide pre-allocated memory space for dynamic data that may change during the execution of the graph. Since CUDA Graphs require static memory allocations for efficient execution, buffer tensors act as placeholders that can hold varying data without altering the overall structure of the graph. This allows for flexibility in handling inputs or intermediate results while still adhering to the constraints of CUDA Graphs, ensuring that the graph can be executed efficiently without the overhead of dynamic memory allocation during runtime.
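A minimal sketch of the buffer pattern (assuming a CUDA device is present; make_graphed_step is a hypothetical helper written for illustration, not a PyTorch API). New data is copied into a pre-allocated static input buffer before each replay, so the captured graph always sees the same memory addresses:

```python
import torch

def make_graphed_step(model, static_input):
    # Warm up in a side stream so library workspaces get allocated
    # before capture (recommended practice for CUDA graph capture)
    s = torch.cuda.Stream()
    s.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(s):
        for _ in range(3):
            model(static_input)
    torch.cuda.current_stream().wait_stream(s)

    # Capture one forward pass into a CUDA graph
    graph = torch.cuda.CUDAGraph()
    with torch.cuda.graph(graph):
        static_output = model(static_input)

    def step(new_input):
        static_input.copy_(new_input)  # fill the buffer; addresses unchanged
        graph.replay()                 # re-run the captured kernels
        return static_output           # results land in the static output buffer
    return step

if torch.cuda.is_available():
    model = torch.nn.Linear(4, 2).cuda()
    buf = torch.zeros(3, 4, device="cuda")  # static input buffer
    step = make_graphed_step(model, buf)
    out = step(torch.rand(3, 4, device="cuda"))
```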