PNDMScheduler has a considerable cost when running StableDiffusionPipeline in diffusers 0.4.1

### Describe the bug

## Environment
GPU: A10, CUDA 11.6, cuDNN 8.4.0
Torch: 1.12.1
diffusers: 0.4.1

## Phenomenon
When I ran StableDiffusionPipeline with fp16 precision, I found that the time spent in PNDMScheduler increased a lot after I upgraded diffusers to 0.4.1. It takes about **4.2 seconds**, while the UNet takes **6.6 seconds**. In diffusers 0.3.0 the time spent in PNDMScheduler was almost negligible. I wonder what changed in the upgrade.

## Profile 
![image](https://user-images.githubusercontent.com/10826371/194736450-46ebab17-8ea9-41d4-aaa3-50a77a320751.png)

## Code

```python
from diffusers import StableDiffusionPipeline
import time
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
    use_auth_token=True)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
start = time.time()
image = pipe(prompt, num_inference_steps=100).images[0]
time_cost = time.time() - start
image.save("astronaut_rides_horse.png")
print(f"Image saved! Total time cost: {time_cost:.2f} s")
```
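To split the total time between the scheduler and the UNet without a full profiler, a small timing wrapper might help. This is only a sketch: `timed` and `totals` are hypothetical names introduced here, not part of diffusers, and the commented-out usage assumes the `pipe` and `prompt` objects from the script above. Since CUDA kernels run asynchronously, an optional `sync` callback (for example `torch.cuda.synchronize`) is called before and after each timed call so that GPU work is attributed to the call that launched it.

```python
import time
from collections import defaultdict
from functools import wraps


def timed(fn, totals, name, sync=None):
    """Wrap fn so each call adds its wall-clock time to totals[name].

    sync is an optional zero-argument callable (e.g. torch.cuda.synchronize)
    invoked before and after the call to flush pending asynchronous work.
    """
    @wraps(fn)
    def wrapper(*args, **kwargs):
        if sync is not None:
            sync()
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            if sync is not None:
                sync()
            totals[name] += time.perf_counter() - start
    return wrapper


# Accumulated seconds per wrapped callable.
totals = defaultdict(float)

# Assumed usage with the pipeline from the script above (not run here):
# pipe.scheduler.step = timed(pipe.scheduler.step, totals, "scheduler.step", torch.cuda.synchronize)
# pipe.unet.forward = timed(pipe.unet.forward, totals, "unet.forward", torch.cuda.synchronize)
# pipe(prompt, num_inference_steps=100)
# print(dict(totals))
```

Running this once with `sync` set and once without it may show whether the scheduler time is real compute or just the point where the host waits for queued GPU work.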

### Reproduction

_No response_

### Logs

_No response_

### System Info

- `diffusers` version: 0.4.1
- Platform: Linux-4.14.0_1-0-0-43-x86_64-with-centos-7.9.2009-Core
- Python version: 3.7.0
- PyTorch version (GPU?): 1.12.1+cu116 (True)
- Huggingface_hub version: 0.10.0
- Transformers version: 4.22.2
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
