Here we share example source code for the Python numpy module's random functionality, to give you a better understanding of what numpy's random module really does. Along the way we also cover: "9. Collect image paths to build training and validation sets plus same-face and different-face test pairs, saved as .csv (1. random.shuffle for data cleaning, 2. random.sample for sampling from data...)", the difference between dask.array.from_array(np.random.random) and dask.array.random.random, the difference between Math.random() and Random.nextInt(), and np.random.randn(), np.random.rand(), np.random.randint().
Contents of this article:
- Python numpy module random() example source code (numpy's random module)
- 9. Collect image paths to build training and validation sets plus same-face and different-face test pairs, saved as .csv (1. random.shuffle for data cleaning, 2. random.sample for sampling from data...)
- What is the difference between dask.array.from_array(np.random.random) and dask.array.random.random?
- The difference between Math.random() and Random.nextInt()
- np.random.randn(), np.random.rand(), np.random.randint()
Python numpy module random() example source code (numpy's random module)
We extracted the following 50 code examples from open-source Python projects to illustrate how numpy.random is used.
def isotropic_mean_shift(self):
    """normalized last mean shift, under random selection N(0, I)
    distributed.

    Caveat: while it is finite and close to sqrt(n) under random
    selection, the length of the normalized mean shift under
    *systematic* selection (e.g. on a linear function) tends to
    infinity for mueff -> infty. Hence it must be used with great
    care for large mueff.
    """
    z = self.sm.transform_inverse((self.mean - self.mean_old) /
                                  self.sigma_vec.scaling)
    # works unless a re-parametrisation has been done
    # assert Mh.vequals_approximately(z, np.dot(es.B, (1. / es.D) *
    #     np.dot(es.B.T, (es.mean - es.mean_old) / es.sigma_vec)))
    z /= self.sigma * self.sp.cmean
    z *= self.sp.weights.mueff**0.5
    return z
def _random_vec(sites, ldim, randstate=None, dtype=np.complex_):
    """Returns a random complex vector (normalized to ||x||_2 = 1) of shape
    (ldim,) * sites, i.e. a pure state with local dimension `ldim` living on
    `sites` sites.

    :param sites: Number of local sites
    :param ldim: Local dimension
    :param randstate: numpy.random.RandomState instance or None
    :returns: numpy.ndarray of shape (ldim,) * sites

    >>> psi = _random_vec(5, 2); psi.shape
    (2, 2, 2, 2, 2)
    >>> np.abs(np.vdot(psi, psi) - 1) < 1e-6
    True
    """
    shape = (ldim, ) * sites
    psi = _randfuncs[dtype](shape, randstate=randstate)
    psi /= np.linalg.norm(psi)
    return psi
def random_mps(sites, ldim, rank, randstate=None, force_rank=False):
    """Returns a randomly chosen normalized matrix product state

    :param sites: Number of sites
    :param ldim: Local dimension
    :param rank: Rank
    :param randstate: numpy.random.RandomState instance or None
    :param force_rank: If True, the rank is exactly `rank`.
        Otherwise, it might be reduced if we reach the maximum sensible rank.
    :returns: randomly chosen matrix product (pure) state

    >>> mps = random_mps(4, 2, 10, force_rank=True)
    >>> mps.ranks, mps.shape
    ((10, 10, 10), ((2,), (2,), (2,), (2,)))
    >>> mps.canonical_form
    (0, 4)
    >>> round(abs(1 - mp.inner(mps, mps)), 10)
    0.0
    """
    return random_mpa(sites, ldim, rank, normalized=True, randstate=randstate,
                      force_rank=force_rank, dtype=np.complex_)
def _standard_normal(shape, randstate=np.random, dtype=np.float_):
    """Generates a standard normal numpy array of given shape and dtype, i.e.
    this function is equivalent to `randstate.randn(*shape)` for real dtype and
    `randstate.randn(*shape) + 1.j * randstate.randn(*shape)` for complex dtype.

    :param tuple shape: Shape of array to be returned
    :param randstate: An instance of :class:`numpy.random.RandomState` (default is
        ``np.random``)
    :param dtype: ``np.float_`` (default) or ``np.complex_``

    Returns
    -------
    A: An array of given shape and dtype with standard normal entries
    """
    if dtype == np.float_:
        return randstate.randn(*shape)
    elif dtype == np.complex_:
        return randstate.randn(*shape) + 1.j * randstate.randn(*shape)
    else:
        raise ValueError('{} is not a valid dtype.'.format(dtype))
def setup_params(self, data):
    keys = ('fun_data', 'fun_y', 'fun_ymin', 'fun_ymax')
    if not any(self.params[k] for k in keys):
        raise PlotnineError('No summary function')

    if self.params['fun_args'] is None:
        self.params['fun_args'] = {}

    if 'random_state' not in self.params['fun_args']:
        if self.params['random_state']:
            random_state = self.params['random_state']
            if random_state is None:
                random_state = np.random
            elif isinstance(random_state, int):
                random_state = np.random.RandomState(random_state)
            self.params['fun_args']['random_state'] = random_state
    return self.params
def bootstrap_statistics(series, statistic, n_samples=1000,
                         confidence_interval=0.95, random_state=None):
    """
    Default parameters taken from
    R's Hmisc smean.cl.boot
    """
    if random_state is None:
        random_state = np.random

    alpha = 1 - confidence_interval
    size = (n_samples, len(series))
    inds = random_state.randint(0, len(series), size=size)
    samples = series.values[inds]
    means = np.sort(statistic(samples, axis=1))
    return pd.DataFrame({'ymin': means[int((alpha/2)*n_samples)],
                         'ymax': means[int((1-alpha/2)*n_samples)],
                         'y': [statistic(series)]})
def mahalanobis_norm(self, dx):
    """compute the Mahalanobis norm that is induced by the adapted
    sample distribution, covariance matrix ``C`` times ``sigma**2``,
    including ``sigma_vec``. The expected Mahalanobis distance to
    the sample mean is about ``sqrt(dimension)``.

    Argument
    --------
    A *genotype* difference `dx`.

    Example
    -------
    >>> import cma, numpy
    >>> es = cma.CMAEvolutionStrategy(numpy.ones(10), 1)
    >>> xx = numpy.random.randn(2, 10)
    >>> d = es.mahalanobis_norm(es.gp.geno(xx[0] - xx[1]))

    `d` is the distance "in" the true sample distribution,
    sampled points have a typical distance of ``sqrt(2*es.N)``,
    where ``es.N`` is the dimension, and an expected distance of
    close to ``sqrt(N)`` to the sample mean. In the example,
    `d` is the Euclidean distance, because C = I and sigma = 1.
    """
    return sqrt(sum((self.D**-1. * np.dot(self.B.T, dx / self.sigma_vec))**2)) / self.sigma
def __call__(self, x, inverse=False):  # function when calling an object
    """Rotates the input array `x` with a fixed rotation matrix
    (``self.dicMatrices[str(len(x))]``)
    """
    x = np.array(x, copy=False)
    N = x.shape[0]  # can be an array or matrix; TODO: accept also a list of arrays?
    if str(N) not in self.dicMatrices:  # create new N-basis once and for all
        rstate = np.random.get_state()
        np.random.seed(self.seed) if self.seed else np.random.seed()
        B = np.random.randn(N, N)
        for i in range(N):
            for j in range(0, i):
                B[i] -= np.dot(B[i], B[j]) * B[j]  # Gram-Schmidt orthogonalization
            B[i] /= sum(B[i]**2)**0.5  # normalize each basis vector
        self.dicMatrices[str(N)] = B
        np.random.set_state(rstate)
    if inverse:
        return np.dot(self.dicMatrices[str(N)].T, x)  # compute inverse rotation
    else:
        return np.dot(self.dicMatrices[str(N)], x)  # compute rotation
# Use rotate(x) to rotate x
def elli(self, x, rot=0, xoffset=0, cond=1e6, actuator_noise=0.0, both=False):
    """Ellipsoid test objective function"""
    if not isscalar(x[0]):  # parallel evaluation
        return [self.elli(xi, rot) for xi in x]  # could save 20% overall
    if rot:
        x = rotate(x)
    N = len(x)
    if actuator_noise:
        x = x + actuator_noise * np.random.randn(N)
    ftrue = sum(cond**(np.arange(N) / (N - 1.)) * (x + xoffset)**2)

    alpha = 0.49 + 1. / N
    beta = 1
    felli = np.random.rand(1)[0]**beta * ftrue * \
        max(1, (10.**9 / (ftrue + 1e-99))**(alpha * np.random.rand(1)[0]))
    # felli = ftrue + 1*np.random.randn(1)[0] / (1e-30 +
    #     np.abs(np.random.randn(1)[0]))**0
    if both:
        return (felli, ftrue)
    else:
        # return felli  # possibly noisy value
        return ftrue  # + np.random.randn()
def __init__(
        self,
        batch_size,
        training_epoch_size=None,
        no_stub_batch=False,
        shuffle=None,
        seed=None,
        buffer_size=2,
):
    self.batch_size = batch_size
    self.training_epoch_size = training_epoch_size
    self.no_stub_batch = no_stub_batch
    self.shuffle = shuffle
    if seed is not None:
        self.random = np.random.RandomState(seed)
    else:
        self.random = np.random
    self.buffer_size = buffer_size
def test_process_image(compress, out_dir):
    numpy.random.seed(8)
    image = numpy.random.randint(256, size=(16, 16, 3), dtype=numpy.uint16)
    meta = {
        "DNA": "/User/jcaciedo/LUAD/dna.tiff",
        "ER": "/User/jcaciedo/LUAD/er.tiff",
        "Mito": "/User/jcaciedo/LUAD/mito.tiff"
    }
    compress.stats["illum_correction_function"] = numpy.ones((16, 16, 3))
    compress.stats["upper_percentiles"] = [255, 255, 255]
    compress.stats["lower_percentiles"] = [0, 0, 0]
    compress.process_image(0, image, meta)

    filenames = glob.glob(os.path.join(out_dir, "*"))
    real_filenames = [os.path.join(out_dir, x) for x in ["dna.png", "er.png", "mito.png"]]
    filenames.sort()
    assert real_filenames == filenames

    for i in range(3):
        data = scipy.misc.imread(filenames[i])
        numpy.testing.assert_array_equal(image[:, :, i], data)
def test_apply(corrector):
    image = numpy.random.randint(256, size=(24, 24, 3), dtype=numpy.uint16)
    illum_corr_func = numpy.random.rand(24, 24, 3)
    illum_corr_func /= illum_corr_func.min()
    corrector.illum_corr_func = illum_corr_func

    corrected = corrector.apply(image)

    expected = image / illum_corr_func
    assert corrected.shape == (24, 24, 3)
    numpy.testing.assert_array_equal(corrected, expected)
def estimate(self, context, data):
    pdf = ScaleMixture()
    alpha = context.prior.alpha
    beta = context.prior.beta
    d = context._d

    if len(data.shape) == 1:
        data = data[:, numpy.newaxis]

    a = alpha + 0.5 * d * len(data.shape)
    b = beta + 0.5 * data.sum(-1) ** 2

    s = numpy.clip(numpy.random.gamma(a, 1. / b), 1e-20, 1e10)

    pdf.scales = s
    context.prior.estimate(s)
    pdf.prior = context.prior
    return pdf
def testRandom(self):
    ig = InverseGaussian(1., 1.)
    samples = ig.random(1000000)
    mu = numpy.mean(samples)
    var = numpy.var(samples)

    self.assertAlmostEqual(ig.mu, mu, delta=1e-1)
    self.assertAlmostEqual(ig.mu ** 3 / ig.shape, var, delta=1e-1)

    ig = InverseGaussian(3., 6.)
    samples = ig.random(1000000)
    mu = numpy.mean(samples)
    var = numpy.var(samples)

    self.assertAlmostEqual(ig.mu, mu, delta=5e-1)
def testRandom(self):
    from scipy.special import kv
    from numpy import sqrt

    a = 2.
    b = 1.
    p = 1
    gig = GeneralizedInverseGaussian(a, b, p)
    samples = gig.random(10000)

    mu_analytical = sqrt(b) * kv(p + 1, sqrt(a * b)) / (sqrt(a) * kv(p, sqrt(a * b)))
    var_analytical = b * kv(p + 2, sqrt(a * b)) / a / kv(p, sqrt(a * b)) - mu_analytical ** 2

    mu = numpy.mean(samples)
    var = numpy.var(samples)

    self.assertAlmostEqual(mu_analytical, mu, delta=1e-1)
    self.assertAlmostEqual(var_analytical, var, delta=1e-1)
def _check_random_state(seed):
    """Turn seed into a np.random.RandomState instance

    If seed is None, return the RandomState singleton used by np.random.
    If seed is an int, return a new RandomState instance seeded with seed.
    If seed is already a RandomState instance, return it.
    Otherwise raise ValueError.
    """
    if seed is None or seed is np.random:
        return np.random.mtrand._rand
    if isinstance(seed, int):
        return np.random.RandomState(seed)
    if isinstance(seed, np.random.RandomState):
        return seed
    raise ValueError('%r cannot be used to seed a numpy.random.RandomState'
                     ' instance' % seed)
def test_poisson(self):
    """Tests that Gibbs sampling the initial process yields a Poisson process."""
    nt = 50
    ns = 1000
    num_giter = 5
    net = self.poisson

    times = []
    for i in range(ns):
        arrv = net.sample(nt)
        obs = arrv.subset(lambda a, e: a.is_last_in_queue(e), copy_evt)
        gsmp = net.gibbs_resample(arrv, num_giter)
        resampled = gsmp[-1]
        evts = resampled.events_of_task(2)
        times.append(evts[0].d)

    exact_sample = [numpy.random.gamma(shape=3, scale=0.5) for i in xrange(ns)]

    times.sort()
    exact_sample.sort()
    print summarize(times)
    print summarize(exact_sample)

    netutils.check_quantiles(self, exact_sample, times, ns)
def _sample_testset(self, data):
    test_sample = self.test_sample
    if not isinstance(test_sample, int):
        return data

    userid, feedback = self.fields.userid, self.fields.feedback
    if test_sample > 0:
        sampled = (data.groupby(userid, sort=False, group_keys=False)
                       .apply(random_choice, test_sample, self.random_state or np.random))
    elif test_sample < 0:  # leave only the most negative feedback from user
        idx = (data.groupby(userid, sort=False)[feedback]
                   .nsmallest(-test_sample).index.get_level_values(1))
        sampled = data.loc[idx]
    else:
        sampled = data
    return sampled
def elastic_transform(image, alpha, sigma, random_state=None):
    """Elastic deformation of images as described in [Simard2003]_.

    .. [Simard2003] Simard, Steinkraus and Platt, "Best Practices for
       Convolutional Neural Networks applied to Visual Document Analysis", in
       Proc. of the International Conference on Document Analysis and
       Recognition, 2003.
    """
    if random_state is None:
        random_state = np.random.RandomState(None)

    shape = image.shape[1:]
    dx = gaussian_filter((random_state.rand(*shape) * 2 - 1), sigma, mode="constant", cval=0) * alpha
    dy = gaussian_filter((random_state.rand(*shape) * 2 - 1), sigma, mode="constant", cval=0) * alpha

    x, y = np.meshgrid(np.arange(shape[1]), np.arange(shape[0]))
    indices = np.reshape(y + dy, (-1, 1)), np.reshape(x + dx, (-1, 1))

    # return map_coordinates(image, indices, order=1).reshape(shape)
    res = np.zeros_like(image)
    for i in xrange(image.shape[0]):
        res[i] = map_coordinates(image[i], indices, order=1).reshape(shape)
    return res
def load_augment(fname, w, h, aug_params=no_augmentation_params,
                 transform=None, sigma=0.0, color_vec=None):
    """Load augmented image with output shape (w, h).

    Default arguments return a non-augmented image of shape (w, h).
    To apply a fixed transform (and color augmentation) specify transform
    (and color_vec).
    To generate a random augmentation specify aug_params and sigma.
    """
    img = load_image(fname)
    if transform is None:
        img = perturb(img, augmentation_params=aug_params, target_shape=(w, h))
    else:
        img = perturb_fixed(img, tform_augment=transform, target_shape=(w, h))
    np.subtract(img, MEAN[:, np.newaxis, np.newaxis], out=img)
    np.divide(img, STD[:, np.newaxis, np.newaxis], out=img)
    img = augment_color(img, sigma=sigma, color_vec=color_vec)
    return img
def form_set_data(labels, max_num, verbose=False):
    """Generate label sets from sample labels.

    For each sample, generate a set by random sampling within the same class.
    Set is a tensor.
    """
    # group sample ids based on label.
    label_ids = {}
    for idx in range(labels.size):
        if labels[idx] not in label_ids:
            label_ids[labels[idx]] = []
        label_ids[labels[idx]].append(idx)

    set_ids = {}
    for idx in range(labels.size):
        samp_ids = label_ids[labels[idx]][:]
        samp_num = min(max_num, len(samp_ids))
        set_ids[idx] = rand.sample(samp_ids, samp_num)
        if verbose:
            print "set {} formed.".format(idx)
    return set_ids
def rvs(cls, a, b, size=1, random_state=None):
    """Draw random variates.

    Parameters
    ----------
    a : float or array-like
    b : float or array-like
    size : int, optional
    random_state : RandomState, optional

    Returns
    -------
    np.array
    """
    u = ss.uniform.rvs(loc=a, scale=b - a, size=size, random_state=random_state)
    x = np.exp(u)
    return x
def rvs(cls, b, size=1, random_state=None):
    """Get random variates.

    Parameters
    ----------
    b : float
    size : int or tuple, optional

    Returns
    -------
    arraylike
    """
    u = ss.uniform.rvs(loc=0, scale=1, size=size, random_state=random_state)
    t1 = np.where(u < 0.5, np.sqrt(2. * u) * b - b, -np.sqrt(2. * (1. - u)) * b + b)
    return t1
def rvs(self, size=None, random_state=None):
    """Sample the joint prior."""
    random_state = np.random if random_state is None else random_state
    context = ComputationContext(size or 1, seed='global')
    loaded_net = self.client.load_data(self._rvs_net, context, batch_index=0)
    # Change to the correct random_state instance
    # TODO: allow passing random_state to ComputationContext seed
    loaded_net.node['_random_state'] = {'output': random_state}
    batch = self.client.compute(loaded_net)
    rvs = np.column_stack([batch[p] for p in self.parameter_names])
    if self.dim == 1:
        rvs = rvs.reshape(size or 1)
    return rvs[0] if size is None else rvs
def raw_to_floatX(imb, pixel_shift=0.5, square=True, center=False, rng=None):
    rng = rng if rng else np.random
    w, h = imb.shape[2], imb.shape[3]  # image size
    x, y = 0, 0  # offsets
    if square:
        if w > h:
            if center:
                x = (w - h) / 2
            else:
                x = rng.randint(w - h)
            w = h
        elif h > w:
            if center:
                y = (h - w) / 2
            else:
                y = rng.randint(h - w)
            h = w
    return nn.utils.floatX(imb)[:, :, x:x + w, y:y + h] / 255. - pixel_shift

# creates an hdf5 file from a dataset given a split in the form {'train': (0, n)}, etc.
# appears to save in unpredictable order, so order must be verified after creation
def batch_loader(self, rnd_gen=np.random, shuffle=True):
    """load_mbs yields a new minibatch at each iteration"""
    batchsize = self.batchsize
    inds = np.arange(self.n_samples)
    if shuffle:
        rnd_gen.shuffle(inds)
    n_mbs = np.int(np.ceil(self.n_samples / batchsize))

    x = np.zeros(self.X_shape, np.float32)
    y = np.zeros(self.y_shape, np.float32)
    ids = np.empty((batchsize,), np.object_)

    for m in range(n_mbs):
        start = m * batchsize
        end = (m + 1) * batchsize
        if end > self.n_samples:
            end = self.n_samples
        mb_slice = slice(start, end)

        x[:end - start, :] = self.x[inds[mb_slice], :]
        y[:end - start, :] = self.y[inds[mb_slice], :]
        ids[:end - start] = self.ids[inds[mb_slice]]
        yield dict(X=x, y=y, ID=ids)
def __init__(self, n_samples, duration, *ops, **kwargs):
    super(Sampler, self).__init__(*ops)
    self.n_samples = n_samples
    self.duration = duration

    random_state = kwargs.pop('random_state', None)
    if random_state is None:
        self.rng = np.random
    elif isinstance(random_state, int):
        self.rng = np.random.RandomState(seed=random_state)
    elif isinstance(random_state, np.random.RandomState):
        self.rng = random_state
    else:
        raise ParameterError('Invalid random_state={}'.format(random_state))
def __init__(self, data, rng=None):
    if rng is None:
        rng = np.random
    if is_integer(data):
        if data < 1:
            raise ValidationError("Number of dimensions must be a "
                                  "positive int", attr='data', obj=self)
        self.v = rng.randn(data)
        self.v /= np.linalg.norm(self.v)
    else:
        self.v = np.array(data, dtype=float)
        if len(self.v.shape) != 1:
            raise ValidationError("'data' must be a vector", 'data', self)
    self.v.setflags(write=False)
def mahalanobis_norm(self, dx):
    """return Mahalanobis norm based on the current sample
    distribution.

    The norm is based on covariance matrix ``C`` times ``sigma**2``,
    and includes ``sigma_vec``. The expected Mahalanobis distance to
    the sample mean is about ``sqrt(dimension)``.

    Argument
    --------
    A *genotype* difference `dx`.

    Example
    -------
    >>> import cma, numpy
    >>> es = cma.CMAEvolutionStrategy(numpy.ones(10), 1)  #doctest: +ELLIPSIS
    (5_w,...
    >>> xx = numpy.random.randn(2, 10)
    >>> d = es.mahalanobis_norm(es.gp.geno(xx[0] - xx[1]))

    `d` is the distance "in" the true sample distribution; in the
    example, `d` is the Euclidean distance, because C = I and sigma = 1.
    """
    return self.sm.norm(np.asarray(dx) / self.sigma_vec.scaling) / self.sigma
def get_rng():
    """Get the package-level random number generator.

    Returns
    -------
    :class:`numpy.random.RandomState` instance
        The :class:`numpy.random.RandomState` instance passed to the most
        recent call of :func:`set_rng`, or ``numpy.random`` if :func:`set_rng`
        has never been called.
    """
    return _rng
def set_rng(rng):
    """Set the package-level random number generator.

    Parameters
    ----------
    rng : ``numpy.random`` or a :class:`numpy.random.RandomState` instance
        The random number generator to use.
    """
    global _rng
    _rng = rng
def set_seed(seed):
    """Set numpy seed.

    Parameters
    ----------
    seed : int
    """
    global _rng
    _rng = np.random.RandomState(seed)
def __init__(self, db, keys, rng=np.random):
    super(DataIterator, self).__init__()
    self.db = db
    self.keys = keys
    self.rng = rng

    # If there is only one key, wrap it in a list
    if isinstance(self.keys, str):
        self.keys = [self.keys]

    # Retrieve the data specification (shape & dtype) for all data objects
    # Assumes that all samples have the same shape and data type
    self.spec = db.get_data_specification(0)
def __init__(self, db, keys, batch_size, shuffle=False, endless=True,
             rng=np.random):
    super(SimpleBatch, self).__init__(db, keys, rng)
    self.batch_size = batch_size
    self.shuffle = shuffle
    self.endless = endless

    # Set up Python generator
    self.gen = self.batch()
def __init__(self, db, keys, batch_size, shuffle=False, endless=True,
             rng=np.random):
    super(SimpleBatchThreadSafe, self).__init__(db, keys, batch_size,
                                                shuffle, endless, rng)

def __init__(self, db, keys, batch_size, rng=np.random):
    super(StochasticBatch, self).__init__(db, keys, rng)
    self.batch_size = batch_size

    # Set up Python generator
    self.gen = self.batch()

def __init__(self, db, keys, batch_size, rng=np.random):
    super(StochasticBatchThreadSafe, self).__init__(db, keys, batch_size, rng)
def random_lowrank(rows, cols, rank, randstate=np.random, dtype=np.float_):
    """Returns a random lowrank matrix of given shape and dtype"""
    if dtype == np.float_:
        A = randstate.randn(rows, rank)
        B = randstate.randn(cols, rank)
    elif dtype == np.complex_:
        A = randstate.randn(rows, rank) + 1.j * randstate.randn(rows, rank)
        B = randstate.randn(cols, rank) + 1.j * randstate.randn(cols, rank)
    else:
        raise ValueError("{} is not a valid dtype".format(dtype))
    C = A.dot(B.conj().T)
    return C / np.linalg.norm(C)
def random_fullrank(rows, cols, **kwargs):
    """Returns a random matrix of given shape and dtype. Should provide
    same interface as random_lowrank"""
    kwargs.pop('rank', None)
    return random_lowrank(rows, cols, min(rows, cols), **kwargs)
def _zrandn(shape, randstate=None):
    """Shortcut for :code:`np.random.randn(*shape) + 1.j *
    np.random.randn(*shape)`

    :param randstate: Instance of np.random.RandomState or None (which yields
        the default np.random) (default None)
    """
    randstate = randstate if randstate is not None else np.random
    return randstate.randn(*shape) + 1.j * randstate.randn(*shape)
def _randn(shape, randstate=None):
    """Shortcut for :code:`np.random.randn(*shape)`

    :param randstate: Instance of np.random.RandomState or None (which yields
        the default np.random) (default None)
    """
    randstate = randstate if randstate is not None else np.random
    return randstate.randn(*shape)
def _random_state(sites, ldim, randstate=None):
    """Returns a random positive semidefinite operator of shape (ldim, ldim) *
    sites normalized to Tr rho = 1, i.e. a mixed state with local dimension
    `ldim` living on `sites` sites. Note that the returned state is positive
    semidefinite only when interpreted in global form (see
    :func:`tools.global_to_local`)

    :param sites: Number of local sites
    :param ldim: Local dimension
    :param randstate: numpy.random.RandomState instance or None
    :returns: numpy.ndarray of shape (ldim, ldim) * sites

    >>> from numpy.linalg import eigvalsh
    >>> rho = _random_state(3, 2).reshape((2**3, 2**3))
    >>> all(eigvalsh(rho) >= 0)
    True
    >>> np.abs(np.trace(rho) - 1) < 1e-6
    True
    """
    shape = (ldim**sites, ldim**sites)
    mat = _zrandn(shape, randstate=randstate)
    rho = np.conj(mat.T).dot(mat)
    rho /= np.trace(rho)
    return rho.reshape((ldim,) * 2 * sites)

####################################
#  Factory functions for MPArrays  #
####################################
9. Collect image paths to build training and validation sets plus same-face and different-face test pairs, saved as .csv (1. random.shuffle for data cleaning, 2. random.sample for sampling from data...)
1. random.shuffle(dataset): shuffles the data in place (the "data cleaning" step).
Parameter: dataset is the input data.
2. random.sample(dataset, 2): draws 2 items from dataset.
Parameters: dataset is the data; 2 means two images.
3. random.choice(dataset): draws one random item from the data.
Parameter: dataset is the data to draw from.
4. pickle.dump((v1, v2), f_path, pickle.HIGHEST_PROTOCOL): writes the dataset out as a .pkl file.
Parameters: (v1, v2) is the dataset; f_path is the opened file f; pickle.HIGHEST_PROTOCOL is the pickling protocol.
Code summary: collect the image paths; for each person with 600 images, the first 50 form the validation set and the remaining 550 the training set; people with fewer than 100 images go to the test set, from which 5 same-face pairs and 5 different-face pairs are generated per person; the results are finally saved to csv files.
Step 1: use os.listdir to collect the image paths, adding people with fewer than 100 images to the test set; for people with 600 images, 50 go to the validation set and the other 550 to the training set. Each person corresponds to one label.
Step 2: use test_pair_generate to build the same-face and different-face test pairs.
Step 3: use random.shuffle to shuffle the data, then save the paths in csv format.
# -*- coding: utf-8 -*-
'''
Created on 2019/7/8 9:29
@Author : Sheng1994
'''

import os
import numpy as np
import random
import pickle


def test_pair_generate(test_image_list, each_k=5):
    test_paris_list = []
    test_images_length = len(test_image_list)

    for people_index, people_images in enumerate(test_image_list):
        # generate pairs of the same face
        for _ in range(each_k):
            same_paris = random.sample(people_images, 2)
            test_paris_list.append((same_paris[0], same_paris[1], 1))
        # generate pairs of different faces
        for _ in range(each_k):
            index_random = people_index
            while index_random == people_index:
                index_random = random.randint(0, test_images_length - 1)
            diff_one = random.choice(test_image_list[people_index])
            diff_another = random.choice(test_image_list[index_random])
            test_paris_list.append((diff_one, diff_another, 0))

    return test_paris_list


def save_to_pkl(path, v1, v2):
    pkl_file = open(path, 'wb')
    pickle.dump((v1, v2), pkl_file, pickle.HIGHEST_PROTOCOL)
    pkl_file.close()
def build_dataset(source_folder):
    # Step 1: collect the image paths. Training and validation samples get an
    # integer label; test samples are later turned into same-face and
    # different-face pairs.
    label = 1
    train_dataset, valid_dataset, test_dataset = [], [], []
    counter = 0
    test_pair_counter = 0
    train_counter = 0

    for people_folder in os.listdir(source_folder):
        people_images = []
        people_folder_path = source_folder + os.sep + people_folder
        for vedio_folder in os.listdir(people_folder_path):
            vedio_folder_path = people_folder_path + os.sep + vedio_folder
            for vedio_file_name in os.listdir(vedio_folder_path):
                full_path = vedio_folder_path + os.sep + vedio_file_name
                people_images.append(full_path)
        random.shuffle(people_images)

        if len(people_images) < 100:
            test_dataset.append(people_images)
            test_pair_counter += 1
        else:
            valid_dataset.extend(zip(people_images[0:50], [label] * 50))
            train_dataset.extend(zip(people_images[50:600], [label] * 550))
            label += 1
            train_counter += 1

        print(people_folder + ': id--->' + str(counter))
        counter += 1

    # record how many people went into the training and test sets
    save_to_pkl('image/train_test_number.pkl', train_counter, test_pair_counter)

    # Step 2: generate the test pairs, 5 same-face and 5 different-face
    # pairs per person
    test_pairs_dataset = test_pair_generate(test_dataset, each_k=5)

    random.shuffle(train_dataset)
    random.shuffle(valid_dataset)
    random.shuffle(test_pairs_dataset)

    return train_dataset, valid_dataset, test_pairs_dataset
def save_to_csv(dataset, file_name):
    with open(file_name, "w") as f:
        for item in dataset:
            f.write(",".join(map(str, item)) + "\n")


def run():
    random.seed(7)
    train_dataset, valid_dataset, test_dataset = build_dataset('image\\result')

    # Step 3: shuffle the data and store the datasets as csv files
    train_dataset_path = 'image\\train_dataset.csv'
    valid_dataset_path = 'image\\valid_dataset.csv'
    test_dataset_path = 'image\\test_dataset.csv'
    save_to_csv(train_dataset, train_dataset_path)
    save_to_csv(valid_dataset, valid_dataset_path)
    save_to_csv(test_dataset, test_dataset_path)


if __name__ == '__main__':
    run()
What is the difference between dask.array.from_array(np.random.random) and dask.array.random.random?
How to resolve: what is the difference between dask.array.from_array(np.random.random) and dask.array.random.random?
The situation: we need to train on a pile of data (about 22 GiB). I tested two ways of generating the random training data and tried to train on it with dask, but the data generated by NumPy raises an exception (msgpack: bytes object is too large) while the dask.array version works fine. Does anyone know why?
from dask.distributed import Client
from dask_cuda import LocalCUDACluster
from dask import array as da
import numpy as np
import xgboost as xgb
import time

def main(client):
    regressor = None
    pre = None
    n = 3000
    m = 1000000

    # numpy generated data will raise an exception
    X = np.random.random((m, n))
    y = np.random.random((m, 1))
    X = da.from_array(X, chunks=(1000, n))
    y = da.from_array(y, chunks=(1000, 1))

    # data generated by dask.array works well
    # X = da.random.random(size=(m, n), chunks=(1000, n))
    # y = da.random.random(size=(m, 1), chunks=(1000, 1))

    dtrain = xgb.dask.DaskDMatrix(client, X, y)
    del X
    del y

    params = {'tree_method': 'gpu_hist'}
    watchlist = [(dtrain, 'train')]
    start = time.time()
    bst = xgb.dask.train(client, params, dtrain, num_boost_round=100, evals=watchlist)
    print('consume:', time.time() - start)

if __name__ == '__main__':
    with LocalCUDACluster(n_workers=4, device_memory_limit='12 GiB') as cluster:
        with Client(cluster) as client:
            main(client)
Solution
After a few more tests I found the reason: da.random.random is itself a delayed (lazy) function, so only the definition of the random array is shipped to the workers, and each worker draws its own chunks locally. In our setup, msgpack caps any single payload passed to a worker at 4 GiB, so in general data larger than 4 GiB cannot be handed to Dask XGBoost directly. (As a workaround, we can switch the data to parquet and read it back as dask.dataframe chunks to get around the msgpack limit.)
The following sketch illustrates the point.
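A minimal sketch of the difference, with sizes shrunk so it runs anywhere (these are illustrative commands under the assumptions above, not the original poster's lost ones):

import numpy as np
import dask.array as da

m, n = 10_000, 300  # small stand-ins for the real m=1000000, n=3000

# Eager: NumPy materializes the whole array in the client process first,
# and each serialized chunk must then pass through msgpack to a worker.
X_np = np.random.random((m, n))
X_wrapped = da.from_array(X_np, chunks=(1000, n))

# Lazy: dask.array only records "draw this chunk" tasks in the graph;
# every worker generates its own chunks, so no large buffer is serialized.
X_lazy = da.random.random(size=(m, n), chunks=(1000, n))

print(type(X_lazy))            # dask.array.core.Array, nothing computed yet
print(X_lazy.sum().compute())  # the draws happen on the workers here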
The difference between Math.random() and Random.nextInt()

package cn.wangbingan.vip;

import java.util.Random;

/**
 * The difference between Math.random() and Random.nextInt()
 *
 * @author AK
 */
public class RandomTest {

    public static void main(String[] args) {
        // random number generator
        Random random = new Random();

        // start time
        long startTime1 = System.nanoTime();
        // generate a random number
        long a = random.nextInt(10000);
        // end time
        long endTime1 = System.nanoTime();
        // elapsed time
        long time1 = endTime1 - startTime1;
        System.out.println("Random number: " + a + " => Random took: " + time1);

        // start time
        long startTime2 = System.nanoTime();
        // generate a random number
        int b = (int) (Math.random() * 10000);
        // end time
        long endTime2 = System.nanoTime();
        // elapsed time
        long time2 = endTime2 - startTime2;
        System.out.println("Random number: " + b + " => Math took: " + time2);
    }
}
Output (times in nanoseconds):
Random number: 9441 => Random took: 11000
Random number: 7109 => Math took: 43000
The former is faster: Random.nextInt() typically takes roughly 50% to 80% of the time of Math.random(), sometimes even less.
The reasons:
Math.random() internally just calls Random.nextDouble() on a shared Random instance, so the direct call can hardly lose to the wrapper.
Random.nextDouble() uses Random.next() twice, uniformly distributed over 0 to 1 - (2^-53).
Random.nextInt(n) uses Random.next() no more than twice on average, returning values uniformly distributed over 0 to n - 1.
np.random.randn(), np.random.rand(), np.random.randint()
(1) The np.random.randn() function
Syntax:
np.random.randn(d0, d1, d2, ..., dn)
1) With no arguments, it returns a single float;
2) with one argument, it returns a rank-1 array, which cannot distinguish row and column vectors;
3) with two or more arguments, it returns an array of the corresponding shape, which can represent a vector or a matrix;
4) np.random.standard_normal() is similar to np.random.randn(), but takes its shape argument as a tuple;
5) the arguments of np.random.randn() are normally integers; floats are truncated to integers automatically.
Purpose:
returns one or more samples drawn from the standard normal distribution.
Properties:
the standard normal distribution has mean 0 and standard deviation 1, written N(0, 1). The area under its curve follows the familiar rule: the area within -1.96 to +1.96 is 0.9500 (a 95% probability of falling in that range), and within -2.58 to +2.58 it is 0.9900 (99%).
Samples from np.random.randn() therefore fall mostly between -1.96 and +1.96; larger values do occur, just with low probability. In neural network construction, the weight matrix W is often initialized with this function, typically multiplied by a small factor such as 0.01 to speed up the convergence of gradient descent:
W = np.random.randn(2, 2) * 0.01
import numpy as np

arr1 = np.random.randn(2, 4)
print(arr1)
print('******************************************************************')
arr2 = np.random.rand(2, 4)
print(arr2)
Result:
[[-1.03021018  0.5197033   0.52117459 -0.70102661]
 [ 0.98268569  1.21940697 -1.095241   -0.38161758]]
******************************************************************
[[ 0.19947349  0.05282713  0.56704222  0.45479972]
 [ 0.28827103  0.1643551   0.30486786  0.56386943]]

(2) The np.random.rand() function
Syntax:
np.random.rand(d0, d1, d2, ..., dn)
Note: it is used the same way as np.random.randn().
Purpose:
returns one or more samples drawn from the uniform distribution over [0, 1); 1 itself is excluded.
Application: in deep learning's dropout regularization it can generate the random dropout mask dl, e.g. (keep_prob is the fraction of neurons to keep): dl = np.random.rand(al.shape[0], al.shape[1]) < keep_prob, as in the sketch below.
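A minimal inverted-dropout sketch, assuming an activation matrix al and keep_prob = 0.8 (names follow the formula above):

import numpy as np

al = np.random.randn(4, 3)  # stand-in activations of one layer
keep_prob = 0.8             # fraction of neurons to keep

dl = np.random.rand(al.shape[0], al.shape[1]) < keep_prob  # boolean mask
al = al * dl          # silence roughly 20% of the units
al = al / keep_prob   # rescale so the expected activation is unchanged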
---------------------
Author: 木子木泗 | Source: CSDN | Original: https://blog.csdn.net/u010758410/article/details/71799142

(3) The np.random.randint() function
Syntax:
numpy.random.randint(low, high=None, size=None, dtype='l')
Inputs:
low: the minimum value
high: the maximum value
size: the shape of the output array
dtype: the data type; the default is np.int.
Returns:
a random integer or integer array over the half-open interval [low, high): low is included, high is excluded;
if high is not given, the range defaults to [0, low).
When processing data in Python you often need large amounts of random data; numpy's random module provides the functions to generate it.
Below we go through the common random functions and their usage, based on the official documentation and other blog posts.
First, numpy.random.seed() and numpy.random.RandomState(), two functions frequently used in data processing: they serve the same purpose, namely making the generated random numbers reproducible from run to run, as the sketch below shows.
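A minimal sketch of that equivalence:

import numpy as np

np.random.seed(42)              # seed the global generator
a = np.random.rand(3)

rs = np.random.RandomState(42)  # a dedicated generator with the same seed
b = rs.rand(3)

print(np.allclose(a, b))  # True: same seed, same sequence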
1. numpy.random.rand()
Usage per the official docs: numpy.random.rand(d0, d1, ..., dn)
Creates an array of the given shape, filled with samples drawn uniformly from [0, 1).
Usage example:
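A short usage sketch:

import numpy as np

print(np.random.rand())      # a single float in [0, 1)
print(np.random.rand(3))     # shape (3,)
print(np.random.rand(2, 4))  # shape (2, 4)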
2. numpy.random.randn()
Usage per the official docs: numpy.random.randn(d0, d1, ..., dn)
Creates an array of the given shape whose elements follow the standard normal distribution N(0, 1).
For a general normal distribution, use sigma * np.random.randn(...) + mu.
Usage example:
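A short usage sketch:

import numpy as np

print(np.random.randn())      # one sample from N(0, 1)
print(np.random.randn(2, 3))  # shape (2, 3), standard normal
mu, sigma = 5.0, 2.0
print(sigma * np.random.randn(4) + mu)  # samples from N(5, 2**2)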
3. numpy.random.randint()
Usage per the official docs: numpy.random.randint(low, high=None, size=None, dtype)
Generates integers uniformly distributed over the half-open interval [low, high); if high=None, the range becomes [0, low).
Usage example, covering both the high=None case and high given:
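A short usage sketch:

import numpy as np

print(np.random.randint(5))           # high=None: one integer from [0, 5)
print(np.random.randint(5, size=8))   # eight integers from [0, 5)
print(np.random.randint(2, 10, size=(2, 3)))  # integers from [2, 10)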
4. numpy.random.random_integers()
Usage per the official docs: numpy.random.random_integers(low, high=None, size=None)
Generates integers uniformly distributed over the closed interval [low, high]; if high=None, the range becomes [1, low].
Usage example: see the sketch after the formula below, which covers both the high=None case and high given.
Additionally, to pick one of N evenly spaced points in the interval [a, b], you can use:
a + (b - a) * (numpy.random.random_integers(N) - 1) / (N - 1)
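A short usage sketch (note that random_integers is deprecated in newer NumPy releases in favor of randint(low, high + 1)):

import numpy as np

print(np.random.random_integers(5))         # high=None: one integer from [1, 5]
print(np.random.random_integers(2, 10, 4))  # four integers from [2, 10]

# picking one of N evenly spaced points in [a, b]:
a, b, N = 0.0, 1.0, 5
print(a + (b - a) * (np.random.random_integers(N) - 1) / (N - 1.))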
5. numpy.random.random_sample()
Usage per the official docs: numpy.random.random_sample(size=None)
Returns random floats in the half-open interval [0, 1), in the given shape.
Usage example:
The related functions numpy.random.random(), numpy.random.ranf() and numpy.random.sample() have identical usage and behavior.
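A short usage sketch:

import numpy as np

print(np.random.random_sample())        # one float in [0, 1)
print(np.random.random_sample((2, 3)))  # shape (2, 3)
print(np.random.random((2, 3)))         # alias, same behavior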
6. numpy.random.choice()
Usage per the official docs: numpy.random.choice(a, size=None, replace=True, p=None)
If a is an array, elements are drawn from a; if a is a single int, elements are drawn from range(a).
replace is a bool: if True, the same element may be drawn more than once; if False, it may not.
p is an array giving the probability of drawing each element.
Usage example:
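A short usage sketch:

import numpy as np

print(np.random.choice(5))                     # one value from range(5)
print(np.random.choice([1, 3, 5, 7], size=3))  # draw 3, with replacement
print(np.random.choice([1, 3, 5, 7], 3, replace=False))  # no repeats
print(np.random.choice(4, 10, p=[0.5, 0.3, 0.1, 0.1]))   # weighted draws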
Those are the common uses of the random functions; questions and discussion are welcome.
---------------------
Author: 冻鸡hhhh | Source: CSDN | Original: https://blog.csdn.net/m0_38061927/article/details/75335069
That concludes today's share on Python numpy module random() example source code and numpy's random module. Thanks for your attention. For more on building face train/validation/test sets saved as .csv with random.shuffle and random.sample, the difference between dask.array.from_array(np.random.random) and dask.array.random.random, Math.random() versus Random.nextInt(), or np.random.randn(), np.random.rand() and np.random.randint(), please search this site.