k-means does not classify correctly

MathGenie123 · Monday, 30 May 2022, 15:16

Unfortunately, I have no idea what is going wrong.

#!/usr/bin/env python
# coding: utf-8

# In[6]:


import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs


# In[12]:


data = make_blobs(n_samples=200, n_features=2, 
                           centers=4, cluster_std=1.8,random_state=101)

plt.scatter(data[0][:,0],data[0][:,1],c=data[1],cmap='rainbow')
feature,cluster = data


# In[30]:



indices = np.random.choice(len(feature),k)
initial = np.copy(feature[initialmeans])
point_classification = np.full(len(feature),-1)



# In[57]:


def kmeans(data, k, centroids):
    alldistances = np.array([np.linalg.norm(data - centroid, axis=1)
                             for centroid in centroids])
    assert alldistances.shape == (k, len(data))
    point_classification = np.argmin(alldistances, axis=0)
    new_centroids = np.array([np.mean(data[point_classification == pointclass],
                                      axis=0)
                              for pointclass in np.arange(k)])
    return point_classification, new_centroids
    
point_classification, new_centroids = kmeans(feature, 4, initial)   


# Inspect the results:

fig, ax = plt.subplots(figsize=(10,10))
ax.set_xlim(np.min(feature[:,0])-0.2, np.max(feature[:,0])+0.2)

__deets__ · Monday, 30 May 2022, 15:18

I'm starting to have doubts about your username...

MathGenie123 · Monday, 30 May 2022, 15:46

Why?

MathGenie123 · Monday, 30 May 2022, 16:03

Sorry, I accidentally posted the buggy version. Here is the right one:

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs


# In[18]:


data = make_blobs(n_samples=200, n_features=2, 
                           centers=4, cluster_std=1.8,random_state=101)

plt.scatter(data[0][:,0],data[0][:,1],c=data[1],cmap='rainbow')
feature,cluster = data


# In[19]:


k = 4
# replace=False so that the same point is not drawn twice as a starting centroid
random_indices = np.random.choice(len(feature), k, replace=False)
initial_cs = feature[random_indices]
cs = np.copy(initial_cs)


# In[20]:


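def kmeans(data, k, centroids):
    # Distance from every point to every centroid, shape (k, n_points)
    alldistances = np.array([np.linalg.norm(data - centroid, axis=1)
                             for centroid in centroids])
    assert alldistances.shape == (k, len(data))
    # Assign each point to its nearest centroid
    point_classification = np.argmin(alldistances, axis=0)
    # Move each centroid to the mean of the points assigned to it
    new_centroids = np.array([np.mean(data[point_classification == pointclass],
                                      axis=0)
                              for pointclass in np.arange(k)])
    return point_classification, new_centroids

# Runs a single assignment-and-update step, starting from the centroids cs
point_classification, new_centroids = kmeans(feature, k, cs)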


# Inspect the results:

fig, ax = plt.subplots(figsize=(10,10))
ax.set_xlim(np.min(feature[:,0])-0.2, np.max(feature[:,0])+0.2)
ax.set_ylim(np.min(feature[:,1])-0.2, np.max(feature[:,1])+0.2)
ax.scatter(feature[:, 0], feature[:, 1], c=point_classification)
ax.plot(new_centroids[:, 0], new_centroids[:, 1], '+', c='black', markersize=20)

So I don't know why kmeans classifies the points this way.
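
A possible explanation, offered as a sketch rather than a diagnosis: the kmeans function above performs one assignment-and-update step and is called only once, so the plotted classification is just the nearest-centroid split around four randomly drawn points. k-means normally repeats that step until the centroids stop moving. The version below does that. The names kmeans_step, run_kmeans, max_iter, tol and seed are made up for illustration, and the blobs are drawn with NumPy as a stand-in for make_blobs so that the snippet needs nothing but NumPy.

```python
import numpy as np


def kmeans_step(data, k, centroids):
    """One assignment-and-update step, like kmeans() in the post."""
    # Distance from every point to every centroid, shape (k, n_points).
    distances = np.array([np.linalg.norm(data - c, axis=1) for c in centroids])
    labels = np.argmin(distances, axis=0)
    # Keep the old centroid if a cluster ends up empty (avoids a NaN mean).
    new_centroids = np.array([
        data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
        for j in range(k)
    ])
    return labels, new_centroids


def run_kmeans(data, k, max_iter=100, tol=1e-8, seed=0):
    """Repeat kmeans_step until the centroids stop moving (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # replace=False so that no point is drawn twice as a starting centroid.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(max_iter):
        labels, new_centroids = kmeans_step(data, k, centroids)
        converged = np.allclose(new_centroids, centroids, atol=tol)
        centroids = new_centroids
        if converged:
            break
    return labels, centroids


# Assumed stand-in data: four Gaussian blobs with the same spread as in the post.
rng = np.random.default_rng(101)
true_centers = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], dtype=float)
feature = np.vstack([c + rng.normal(scale=1.8, size=(50, 2))
                     for c in true_centers])

labels, centroids = run_kmeans(feature, k=4)
```

Passing labels and centroids to the ax.scatter and ax.plot calls from the post should then show the converged result. Different seeds can still end in different local optima, so a poor random start may need a rerun.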

__deets__ · Monday, 30 May 2022, 16:07

Well, the "Genie" part doesn't really hold up, does it? And this question is not exactly brilliant either, unless you mean brilliantly badly asked. A "doesn't work" title, and code dumped on the page. And now what? It may be that someone here takes the trouble to dig deep into it. But nobody has to. You substantially increase your chances of getting help if you ask your questions *clearly*.

MathGenie123 · Monday, 30 May 2022, 16:15

Should I repost the question, or should I just keep this in mind for next time?

__deets__ · Monday, 30 May 2022, 16:17

What do you think? Think hard about it.

MathGenie123 · Monday, 30 May 2022, 16:34

I have posted a new, more detailed question.