feat: Support include_null and order_by in ArrayCollect for Spark Connect [pyspark] #11859

@davidlghellin

Description

Is your feature request related to a problem?

Yes. When using collect() with include_null=True or order_by, the PySpark
compiler raises UnsupportedOperationError before the query reaches the backend:

# ibis/backends/sql/compilers/pyspark.py:432-436
def visit_ArrayCollect(self, op, *, arg, where, order_by, include_null, distinct):
    if include_null:
        raise com.UnsupportedOperationError(
            "`include_null=True` is not supported by the pyspark backend"
        )

This blocks the operation at compile time, even when the backend (via Spark
Connect) actually supports these features at the protocol level.
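
For example, a minimal reproduction against an unbound table (a sketch; the
table and column names are illustrative):

import ibis

t = ibis.table({"g": "string", "x": "int64"}, name="t")
expr = t.group_by("g").agg(xs=t.x.collect(include_null=True))

# Raises at compile time, before anything reaches Spark:
# UnsupportedOperationError: `include_null=True` is not supported
# by the pyspark backend
ibis.to_sql(expr, dialect="pyspark")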

What is the motivation behind your request?

The Spark Connect protocol supports an ignore_nulls parameter in
UnresolvedFunction that controls null handling in aggregations like
collect_list/collect_set.

Spark Connect implementations like Sail
(a Rust-native Spark-compatible engine) already support this parameter.

Currently, 12 out of 16 test_collect parameter combinations are marked
as notimpl for pyspark due to this limitation. Enabling support would
significantly improve test coverage for Spark Connect backends.

Possible approaches:

• Detect Spark Connect mode (IS_SPARK_REMOTE or checking the session type) and use the DataFrame API instead of SQL for these operations; see the sketch after this list.
• Allow the operation to pass through when in Spark Connect mode, since the protocol supports it even though the SQL syntax doesn't.

The order_by parameter has the same limitation: supported in the protocol
but blocked by the compiler.
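
As a sketch of the first approach, one way the DataFrame API could emulate
include_null (my own workaround, not code from ibis): collect_list drops null
inputs, but a struct wrapping a null value is itself non-null, so wrapping
and unwrapping preserves the null elements.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("a", 1), ("a", None), ("a", 2)], "g string, x int")

# Wrap each value in a struct so collect_list sees a non-null input,
# then unwrap the field afterwards to keep the null elements.
out = (
    df.groupBy("g")
    .agg(F.collect_list(F.struct(F.col("x").alias("v"))).alias("xs"))
    .withColumn("xs", F.transform("xs", lambda s: s["v"]))
)

order_by could be emulated the same way by collecting (key, value) structs
and applying array_sort before unwrapping.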

Describe the solution you'd like

Detect Spark Connect mode in the PySpark backend (e.g., by checking whether
session._sc is None, or via the IS_SPARK_REMOTE environment variable) and let
the include_null and order_by parameters pass through when running via Spark
Connect, since the protocol supports these features even though the SQL
syntax doesn't.
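
A minimal sketch of that detection, assuming the _sc attribute and the
environment variable named above (the truthiness convention for
IS_SPARK_REMOTE is an assumption):

import os

def _is_spark_connect(session) -> bool:
    # Assumption: IS_SPARK_REMOTE is set to a truthy string when running
    # against a Spark Connect server, per the description above.
    if os.environ.get("IS_SPARK_REMOTE"):
        return True
    # Classic sessions expose a JVM-backed SparkContext as `_sc`;
    # Spark Connect sessions do not.
    return getattr(session, "_sc", None) is None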

What version of ibis are you running?

11

What backend(s) are you using, if any?

pyspark (via Spark Connect)

Code of Conduct

  • I agree to follow this project's Code of Conduct
