Keras: Gradient Descent For Machine Learning - Revision history

Onnowpurbo: /* Summary */

2019-09-08T03:44:56Z

Summary

← Older revision		Revision as of 03:44, 8 September 2019
Line 104:		Line 104:
	* Optimisasi adalah bagian terbesar dari machine learning.		* Optimisasi adalah bagian terbesar dari machine learning.
	* Gradient descent adalah prosedur sederhana dari optimisasi that you can use with many machine learning algorithms.		* Gradient descent adalah prosedur sederhana dari optimisasi that you can use with many machine learning algorithms.
	* Batch gradient descent ~~refers to calculating the derivative from all~~ training ~~data before calculating an~~ update.		* Batch gradient descent mengacu pada menghitung turunan dari semua data training sebelum menghitung update.
	* Stochastic gradient descent ~~refers to calculating the derivative from each~~ training ~~data instance and calculating the~~ update ~~immediately~~.		* Stochastic gradient descent mengacu pada menghitung turunan dari setiap instance data training dan menghitung update segera.

	==Referensi==		==Referensi==

Onnowpurbo: /* Summary */

2019-09-08T03:32:54Z

Summary

← Older revision		Revision as of 03:32, 8 September 2019
Line 100:		Line 100:
	==Summary==		==Summary==

	~~In this post you discovered~~ gradient descent ~~for~~ machine learning. ~~You learned that~~:		Dalam tulisan ini anda mempelajari tentang gradient descent untuk machine learning. Anda belajar bahwa:

	* ~~Optimization is a big part of~~ machine learning.		* Optimisasi adalah bagian terbesar dari machine learning.
	* Gradient descent ~~is a simple optimization procedure~~ that you can use with many machine learning algorithms.		* Gradient descent adalah prosedur sederhana dari optimisasi that you can use with many machine learning algorithms.
	* Batch gradient descent refers to calculating the derivative from all training data before calculating an update.		* Batch gradient descent refers to calculating the derivative from all training data before calculating an update.
	* Stochastic gradient descent refers to calculating the derivative from each training data instance and calculating the update immediately.		* Stochastic gradient descent refers to calculating the derivative from each training data instance and calculating the update immediately.


	==Referensi==		==Referensi==

Onnowpurbo: /* Tips untuk Gradient Descent */

2019-09-08T03:31:22Z

Tips untuk Gradient Descent

← Older revision		Revision as of 03:31, 8 September 2019
Line 94:		Line 94:
	* Plot Cost vs Time: Kumpulkan dan plot nilai cost yang dihitung oleh algoritma setiap iterasi. Berharap untuk menjalankan gradient descent yang berkinerja baik adalah penurunan cost setiap iterasi. Jika tidak berkurang, coba kurangi learning rate.		* Plot Cost vs Time: Kumpulkan dan plot nilai cost yang dihitung oleh algoritma setiap iterasi. Berharap untuk menjalankan gradient descent yang berkinerja baik adalah penurunan cost setiap iterasi. Jika tidak berkurang, coba kurangi learning rate.
	* Learning Rate: Nilai learning rate adalah nilai real kecil seperti 0,1, 0,001 atau 0,0001. Coba nilai yang berbeda untuk masalah anda dan lihat mana yang paling berhasil.		* Learning Rate: Nilai learning rate adalah nilai real kecil seperti 0,1, 0,001 atau 0,0001. Coba nilai yang berbeda untuk masalah anda dan lihat mana yang paling berhasil.
	* Rescale ~~Inputs~~: ~~The algorithm will reach the~~ minimum cost ~~faster if the shape of the cost function is not~~ skewed ~~and distorted~~. ~~You can achieve this by rescaling all of the~~ input ~~variables~~ (X) ~~to the same range~~, ~~such as~~ [0, 1] or [-1, 1].		* Rescale Input: Algoritma akan mencapai cost minimum lebih cepat jika bentuk fungsi cost tidak skewed dan terdistorsi. Kita dapat mencapai ini dengan mengubah skala semua variabel input (X) ke rentang yang sama, seperti [0, 1] atau [-1, 1].
	* ~~Few Passes~~: Stochastic gradient descent ~~often does not need more than~~ 1~~-to-~~10 ~~passes through the~~ training ~~dataset to converge on good or good enough coefficients~~.		* Sedikit saja Pass: Stochastic gradient descent sering tidak membutuhkan lebih dari 1 hingga 10 pass pada dataset training untuk bertemu pada koefisien yang baik atau cukup baik.
	* Plot Mean Cost: ~~The updates for each training~~ dataset ~~instance can result in a noisy~~ plot of cost ~~over time when using~~ stochastic gradient descent. ~~Taking the average over~~ 10, 100, or 1000 ~~updates can give you a better idea of the~~ learning trend ~~for the algorithm~~.		* Plot Mean Cost: Pembaruan untuk setiap instance dataset pelatihan dapat menghasilkan plot cost yang noisy dari waktu ke waktu saat menggunakan stochastic gradient descent. Mengambil rata-rata lebih dari 10, 100, atau 1000 update dapat memberi anda ide yang lebih baik dari learning trend untuk algoritma tersebut.
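The "Plot Mean Cost" tip can be sketched in a few lines of plain Python. This is only an illustration: the helper name `mean_cost`, the window size, and the sample cost values are assumptions made for the example, not part of the original article.

```python
def mean_cost(costs, window=10):
    """Average consecutive groups of `window` cost values.

    Hypothetical helper: turns a noisy per-update cost series from
    stochastic gradient descent into a smoother series for plotting.
    """
    smoothed = []
    for start in range(0, len(costs), window):
        chunk = costs[start:start + window]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up noisy costs, one value per coefficient update.
noisy = [10.0, 2.0, 8.0, 1.0, 6.0, 2.0, 4.0, 0.0]
smoothed = mean_cost(noisy, window=4)
print(smoothed)
```

The raw series jumps up and down from one update to the next, while the averaged series makes the downward trend easier to see. In practice the window would be closer to the 10, 100, or 1000 updates mentioned in the tip.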

	==Summary==		==Summary==

Onnowpurbo: /* Tips untuk Gradient Descent */

2019-09-08T03:27:53Z

Tips untuk Gradient Descent

← Older revision		Revision as of 03:27, 8 September 2019
Line 92:		Line 92:
	Bagian ini mencantumkan beberapa tip dan trik untuk mendapatkan hasil maksimal dari algoritma gradient descent untuk machine learning.		Bagian ini mencantumkan beberapa tip dan trik untuk mendapatkan hasil maksimal dari algoritma gradient descent untuk machine learning.

	* Plot Cost ~~versus~~ Time: ~~Collect and~~ plot ~~the~~ cost ~~values calculated by the algorithm each iteration~~. ~~The expectation for a well performing~~ gradient descent ~~run is a decrease in~~ cost ~~each iteration~~. ~~If it does not decrease~~, ~~try reducing your~~ learning rate.		* Plot Cost vs Time: Kumpulkan dan plot nilai cost yang dihitung oleh algoritma setiap iterasi. Berharap untuk menjalankan gradient descent yang berkinerja baik adalah penurunan cost setiap iterasi. Jika tidak berkurang, coba kurangi learning rate.
	* Learning Rate: ~~The~~ learning rate ~~value is a small~~ real ~~value such as~~ 0.1, 0.001 or 0.0001. ~~Try different values for your problem and see which works best~~.		* Learning Rate: Nilai learning rate adalah nilai real kecil seperti 0,1, 0,001 atau 0,0001. Coba nilai yang berbeda untuk masalah anda dan lihat mana yang paling berhasil.
	* Rescale Inputs: The algorithm will reach the minimum cost faster if the shape of the cost function is not skewed and distorted. You can achieve this by rescaling all of the input variables (X) to the same range, such as [0, 1] or [-1, 1].		* Rescale Inputs: The algorithm will reach the minimum cost faster if the shape of the cost function is not skewed and distorted. You can achieve this by rescaling all of the input variables (X) to the same range, such as [0, 1] or [-1, 1].
	* Few Passes: Stochastic gradient descent often does not need more than 1-to-10 passes through the training dataset to converge on good or good enough coefficients.		* Few Passes: Stochastic gradient descent often does not need more than 1-to-10 passes through the training dataset to converge on good or good enough coefficients.
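As a rough illustration of the "Rescale Inputs" tip, the sketch below applies min-max rescaling to one input variable. The function name `rescale` and the sample values are assumptions for the example. The article does not prescribe a particular rescaling method, so min-max scaling is used here as one common choice.

```python
def rescale(values, low=0.0, high=1.0):
    """Linearly map `values` onto the range [low, high] (min-max scaling)."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant column carries no spread; map every value to `low`.
        return [low for _ in values]
    span = v_max - v_min
    return [low + (v - v_min) * (high - low) / span for v in values]


# Hypothetical input column (X) with an arbitrary original range.
ages = [20.0, 30.0, 40.0, 60.0]
unit_range = rescale(ages)                 # rescaled to [0, 1]
symmetric_range = rescale(ages, -1.0, 1.0) # rescaled to [-1, 1]
print(unit_range)
print(symmetric_range)
```

Each input variable would be rescaled separately, so that all of them end up on the same range before gradient descent is run.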

Onnowpurbo: /* Tips for Gradient Descent */

2019-09-08T03:20:17Z

Tips for Gradient Descent

← Older revision		Revision as of 03:20, 8 September 2019
Line 88:		Line 88:
	Proses learning bisa jauh lebih cepat dengan stochastic gradient descent untuk dataset training yang sangat besar dan seringkali kita hanya perlu sejumlah kecil lintasan melalui dataset untuk mencapai set koefisien yang baik atau cukup baik, mis. 1-sampai-10 pass melewati dataset.		Proses learning bisa jauh lebih cepat dengan stochastic gradient descent untuk dataset training yang sangat besar dan seringkali kita hanya perlu sejumlah kecil lintasan melalui dataset untuk mencapai set koefisien yang baik atau cukup baik, mis. 1-sampai-10 pass melewati dataset.

	==Tips ~~for~~ Gradient Descent==		==Tips untuk Gradient Descent==

	~~This section lists some tips and tricks for getting the most out of the~~ gradient descent ~~algorithm for~~ machine learning.		Bagian ini mencantumkan beberapa tip dan trik untuk mendapatkan hasil maksimal dari algoritma gradient descent untuk machine learning.

	* Plot Cost versus Time: Collect and plot the cost values calculated by the algorithm each iteration. The expectation for a well performing gradient descent run is a decrease in cost each iteration. If it does not decrease, try reducing your learning rate.		* Plot Cost versus Time: Collect and plot the cost values calculated by the algorithm each iteration. The expectation for a well performing gradient descent run is a decrease in cost each iteration. If it does not decrease, try reducing your learning rate.
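The effect of the learning rate on the cost curve can be seen on a toy problem. The sketch below assumes a stand-in cost function f(x) = x² (with derivative 2x) purely for illustration. The function names, the starting point, and the two learning rates are likewise assumptions, not values taken from the article.

```python
def run_gradient_descent(learning_rate, start=5.0, iterations=20):
    """Minimize the toy cost f(x) = x**2 and record the cost at each iteration."""
    x = start
    costs = []
    for _ in range(iterations):
        costs.append(x * x)            # cost before this update
        x -= learning_rate * 2.0 * x   # step against the derivative 2x
    return costs


def decreases_every_iteration(costs):
    """Return True if each recorded cost is lower than the one before it."""
    return all(later < earlier for earlier, later in zip(costs, costs[1:]))


small_step = run_gradient_descent(learning_rate=0.1)
large_step = run_gradient_descent(learning_rate=1.1)
print(decreases_every_iteration(small_step))
print(decreases_every_iteration(large_step))
```

With the smaller learning rate the recorded cost falls at every iteration. With the larger one each step overshoots the minimum and the cost grows, which is the situation where reducing the learning rate is worth trying. The `costs` list is what would be plotted against the iteration number.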

Onnowpurbo: /* Stochastic Gradient Descent untuk Machine Learning */

2019-09-08T03:19:37Z

Stochastic Gradient Descent untuk Machine Learning

← Older revision		Revision as of 03:19, 8 September 2019
Line 78:		Line 78:
	Karena satu iterasi dari algoritma gradient descent memerlukan prediksi untuk setiap instance dalam dataset training, ini bisa memakan waktu lama ketika kita memiliki jutaan instance.		Karena satu iterasi dari algoritma gradient descent memerlukan prediksi untuk setiap instance dalam dataset training, ini bisa memakan waktu lama ketika kita memiliki jutaan instance.

	~~In situations when you have large amounts of~~ data, ~~you can use a variation of~~ gradient descent ~~called~~ stochastic gradient descent.		Pada saat kita memiliki data yang besar, kita dapat menggunakan variasi gradient descent yang disebut stochastic gradient descent.

	~~In this variation~~, ~~the~~ gradient descent ~~procedure described above is run but the~~ update ~~to the coefficients is performed for each~~ training ~~instance~~, ~~rather than at the end of the~~ batch ~~of instances~~.		Dalam variasi ini, prosedur gradient descent yang dijelaskan di atas dijalankan tetapi update koefisien dilakukan untuk setiap instance training, bukan pada akhir batch instance.

	~~The first step of the procedure requires that the order of the~~ training ~~dataset is randomized~~. ~~This is to mix up the order that updates are made to the coefficients~~. ~~Because the coefficients are updated after every~~ training instance, ~~the updates will be noisy jumping all over the place~~, ~~and so will the corresponding~~ cost ~~function~~. ~~By mixing up the order for the updates to the coefficients~~, ~~it harnesses this random walk and avoids it getting distracted or stuck~~.		Langkah pertama dari prosedur ini mensyaratkan bahwa urutan dataset training di acak. Ini untuk mengacak urutan update untuk koefisien. Karena koefisien di update setelah setiap training instance, update merupakan melompat acak di semua tempat, dan demikian pula fungsi cost yang sesuai. Dengan mengacak urutan update untuk koefisien, ini akan mengacak jalan dan menghindari akan gangguan atau macet.

	~~The~~ update ~~procedure for the coefficients is the same as that above~~, ~~except the~~ cost ~~is not summed over all~~ training ~~patterns~~, ~~but instead calculated for one~~ training ~~pattern~~.		Prosedur update untuk koefisien sama dengan yang di atas, kecuali cost tidak dijumlahkan pada semua pola training, tetapi dihitung untuk satu pola training.

	~~The~~ learning ~~can be much faster with~~ stochastic gradient descent ~~for very large~~ training ~~datasets and often you only need a small number of passes through the~~ dataset ~~to reach a good or good enough~~ set ~~of coefficients~~, ~~e.g~~. 1-to-10 ~~passes through the~~ dataset.		Proses learning bisa jauh lebih cepat dengan stochastic gradient descent untuk dataset training yang sangat besar dan seringkali kita hanya perlu sejumlah kecil lintasan melalui dataset untuk mencapai set koefisien yang baik atau cukup baik, mis. 1-sampai-10 pass melewati dataset.
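The stochastic procedure described above can be sketched as follows for a simple linear regression y = b0 + b1 * x. This is a minimal illustration under stated assumptions: the function name, the squared-error cost per instance, the learning rate, the number of passes, and the noise-free toy data are all chosen for the example and do not come from the article.

```python
import random


def stochastic_gradient_descent(xs, ys, learning_rate=0.1, epochs=10, seed=0):
    """Fit y = b0 + b1 * x, updating the coefficients after every instance."""
    rng = random.Random(seed)
    b0, b1 = 0.0, 0.0
    data = list(zip(xs, ys))
    for _ in range(epochs):
        # Randomize the order of the training data at the start of each pass.
        rng.shuffle(data)
        for x, y in data:
            # Error for this single training instance only.
            error = (b0 + b1 * x) - y
            # Derivatives of the squared error (error ** 2) for this instance.
            b0 -= learning_rate * 2.0 * error
            b1 -= learning_rate * 2.0 * error * x
    return b0, b1


# Toy data generated from y = 1 + 2x, with inputs already in [0, 1].
xs = [i / 199 for i in range(200)]
ys = [1.0 + 2.0 * x for x in xs]
b0, b1 = stochastic_gradient_descent(xs, ys)
print(b0, b1)
```

On this small noise-free dataset, ten passes are enough for the coefficients to land very close to 1 and 2. Real data is noisier, so the coefficients would keep fluctuating around a good solution rather than settling exactly.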

	==Tips for Gradient Descent==		==Tips for Gradient Descent==

Onnowpurbo: /* Stochastic Gradient Descent for Machine Learning */

2019-09-08T02:47:04Z

Stochastic Gradient Descent for Machine Learning

← Older revision		Revision as of 02:47, 8 September 2019
Line 72:		Line 72:
	Batch gradient descent adalah bentuk paling umum dari gradient descent yang dijelaskan dalam machine learning.		Batch gradient descent adalah bentuk paling umum dari gradient descent yang dijelaskan dalam machine learning.

	==Stochastic Gradient Descent ~~for~~ Machine Learning==		==Stochastic Gradient Descent untuk Machine Learning==

	Gradient descent ~~can be slow to run on very large datasets~~.		Gradient descent bisa berjalan lambat pada dataset yang sangat besar.

	~~Because one iteration of the~~ gradient descent ~~algorithm requires a prediction for each~~ instance ~~in the~~ training ~~dataset~~, ~~it can take a long time when you have many millions of instances~~.		Karena satu iterasi dari algoritma gradient descent memerlukan prediksi untuk setiap instance dalam dataset training, ini bisa memakan waktu lama ketika kita memiliki jutaan instance.

	In situations when you have large amounts of data, you can use a variation of gradient descent called stochastic gradient descent.		In situations when you have large amounts of data, you can use a variation of gradient descent called stochastic gradient descent.
Line 87:		Line 87:

	The learning can be much faster with stochastic gradient descent for very large training datasets and often you only need a small number of passes through the dataset to reach a good or good enough set of coefficients, e.g. 1-to-10 passes through the dataset.		The learning can be much faster with stochastic gradient descent for very large training datasets and often you only need a small number of passes through the dataset to reach a good or good enough set of coefficients, e.g. 1-to-10 passes through the dataset.


	==Tips for Gradient Descent==		==Tips for Gradient Descent==

Onnowpurbo: /* Batch Gradient Descent untuk Machine Learning */

2019-09-08T02:46:01Z

Batch Gradient Descent untuk Machine Learning

← Older revision		Revision as of 02:46, 8 September 2019
Line 66:		Line 66:
	Evaluasi seberapa dekat model machine learning memperkirakan fungsi target dapat dihitung dengan berbagai cara, seringkali khusus untuk algoritma machine learning. Fungsi cost melibatkan evaluasi koefisien dalam model machine learning dengan menghitung prediksi untuk model untuk setiap contoh training instance dalam dataset dan membandingkan prediksi dengan nilai output aktual dan menghitung jumlah atau kesalahan rata-rata (seperti Sum of Squared Residuals atau SSR dalam hal linear regression).		Evaluasi seberapa dekat model machine learning memperkirakan fungsi target dapat dihitung dengan berbagai cara, seringkali khusus untuk algoritma machine learning. Fungsi cost melibatkan evaluasi koefisien dalam model machine learning dengan menghitung prediksi untuk model untuk setiap contoh training instance dalam dataset dan membandingkan prediksi dengan nilai output aktual dan menghitung jumlah atau kesalahan rata-rata (seperti Sum of Squared Residuals atau SSR dalam hal linear regression).

	~~From the~~ cost ~~function a derivative can be calculated for each coefficient so that it can be updated using exactly the~~ update ~~equation described above~~.		Dari fungsi cost, turunan dapat dihitung untuk setiap koefisien sehingga dapat di update menggunakan persamaan update yang dijelaskan di atas.

	~~The cost is calculated for a~~ machine learning ~~algorithm over the entire~~ training ~~dataset for each iteration of the~~ gradient descent ~~algorithm~~. ~~One iteration of the algorithm is called one~~ batch ~~and this form of~~ gradient descent ~~is referred to as~~ batch gradient descent.		Cost dihitung untuk algoritma machine learning atas seluruh dataset training untuk setiap iterasi dari algoritma gradient descent. Satu iterasi dari algoritma ini disebut satu batch dan bentuk gradient descent ini disebut sebagai batch gradient descent.

	Batch gradient descent ~~is the most common form of~~ gradient descent ~~described in~~ machine learning.		Batch gradient descent adalah bentuk paling umum dari gradient descent yang dijelaskan dalam machine learning.
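For comparison, a batch update for simple linear regression might look like the sketch below. It is an illustration only: the function name, the use of the mean squared error as the cost, the learning rate, and the toy data are assumptions made for the example.

```python
def batch_gradient_descent(xs, ys, learning_rate=0.05, iterations=2000):
    """Fit y = b0 + b1 * x with one coefficient update per pass over all data."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Predictions and errors for every training instance (the whole batch).
        errors = [(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        # Derivatives of the mean squared error with respect to each coefficient.
        grad_b0 = 2.0 * sum(errors) / n
        grad_b1 = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        # A single update, made only after the entire dataset has been seen.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


# Toy data generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
b0, b1 = batch_gradient_descent(xs, ys)
print(b0, b1)
```

Every iteration touches all training instances before the coefficients change, which is why this form becomes slow once the dataset is very large.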

	==Stochastic Gradient Descent for Machine Learning==		==Stochastic Gradient Descent for Machine Learning==

Onnowpurbo: /* Batch Gradient Descent untuk Machine Learning */

2019-09-08T02:13:14Z

Batch Gradient Descent untuk Machine Learning

← Older revision		Revision as of 02:13, 8 September 2019
Line 62:		Line 62:
	Beberapa algoritma machine learning memiliki koefisien yang mencirikan estimasi algoritma untuk fungsi target (f). Algoritma yang berbeda memiliki representasi yang berbeda dan koefisien yang berbeda, tetapi banyak dari mereka memerlukan proses optimasi untuk menemukan set koefisien yang menghasilkan estimasi terbaik dari fungsi target.		Beberapa algoritma machine learning memiliki koefisien yang mencirikan estimasi algoritma untuk fungsi target (f). Algoritma yang berbeda memiliki representasi yang berbeda dan koefisien yang berbeda, tetapi banyak dari mereka memerlukan proses optimasi untuk menemukan set koefisien yang menghasilkan estimasi terbaik dari fungsi target.

	~~Common examples of algorithms with coefficients that can be optimized using~~ gradient descent ~~are~~ Linear Regression ~~and~~ Logistic Regression.		Contoh umum dari algoritma dengan koefisien yang dapat dioptimalkan menggunakan gradient descent adalah Linear Regression dan Logistic Regression.

	~~The evaluation of how close a fit a~~ machine learning ~~model estimates the~~ target ~~function can be calculated a number of different ways~~, ~~often specific to the~~ machine learning ~~algorithm~~. ~~The~~ cost ~~function involves evaluating the coefficients in the~~ machine learning model ~~by calculating a prediction for the model for each~~ training instance ~~in the~~ dataset ~~and comparing the predictions to the actual~~ output ~~values and calculating a sum or average error~~ (~~such as the~~ Sum of Squared Residuals or SSR ~~in the case of~~ linear regression).		Evaluasi seberapa dekat model machine learning memperkirakan fungsi target dapat dihitung dengan berbagai cara, seringkali khusus untuk algoritma machine learning. Fungsi cost melibatkan evaluasi koefisien dalam model machine learning dengan menghitung prediksi untuk model untuk setiap contoh training instance dalam dataset dan membandingkan prediksi dengan nilai output aktual dan menghitung jumlah atau kesalahan rata-rata (seperti Sum of Squared Residuals atau SSR dalam hal linear regression).

	From the cost function a derivative can be calculated for each coefficient so that it can be updated using exactly the update equation described above.		From the cost function a derivative can be calculated for each coefficient so that it can be updated using exactly the update equation described above.
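As a worked example of taking the derivative of a cost function, consider simple linear regression with the Sum of Squared Residuals as the cost. The notation is an assumption for illustration: $b_0$ and $b_1$ stand for the coefficients, $\hat{y}_i = b_0 + b_1 x_i$ for the prediction on training instance $i$, and $\alpha$ for the learning rate.

```latex
\begin{aligned}
\mathrm{SSR}(b_0, b_1) &= \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2,
  \qquad \hat{y}_i = b_0 + b_1 x_i \\
\frac{\partial\,\mathrm{SSR}}{\partial b_0} &= 2 \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right) \\
\frac{\partial\,\mathrm{SSR}}{\partial b_1} &= 2 \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right) x_i \\
b_j &\leftarrow b_j - \alpha \, \frac{\partial\,\mathrm{SSR}}{\partial b_j},
  \qquad j \in \{0, 1\}
\end{aligned}
```

Each coefficient is moved a small step against its own derivative, and the sums run over the whole training dataset.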

Onnowpurbo: /* Batch Gradient Descent for Machine Learning */

2019-09-07T04:48:17Z

Batch Gradient Descent for Machine Learning

← Older revision		Revision as of 04:48, 7 September 2019
Line 56:		Line 56:
	Kita dapat melihat bagaimana sederhananya gradient descent. Itu memang mengharuskan kita untuk mengetahui gradien dari fungsi cost kita atau fungsi yang kita optimalkan, tetapi selain itu, itu sangat mudah. Selanjutnya kita akan melihat bagaimana kita dapat menggunakan ini dalam algoritma machine learning.		Kita dapat melihat bagaimana sederhananya gradient descent. Itu memang mengharuskan kita untuk mengetahui gradien dari fungsi cost kita atau fungsi yang kita optimalkan, tetapi selain itu, itu sangat mudah. Selanjutnya kita akan melihat bagaimana kita dapat menggunakan ini dalam algoritma machine learning.

	==Batch Gradient Descent ~~for~~ Machine Learning==		==Batch Gradient Descent untuk Machine Learning==

	~~The goal of all~~ supervised machine learning ~~algorithms is to best estimate a~~ target ~~function~~ (f) ~~that maps~~ input ~~data~~ (X) ~~onto~~ output ~~variables~~ (Y). ~~This describes all classification and regression problems~~.		Tujuan dari semua algoritma supervised machine learning adalah untuk memperkirakan fungsi target (f) terbaik yang memetakan data input (X) ke variabel output (Y). Ini menjelaskan semua masalah klasifikasi dan regresi.

	~~Some~~ machine learning ~~algorithms have coefficients that characterize the algorithms estimate for the~~ target ~~function~~ (f). ~~Different algorithms have different representations and different coefficients~~, ~~but many of them require a process of optimization to find the~~ set ~~of coefficients that result in the best estimate of the~~ target ~~function~~.		Beberapa algoritma machine learning memiliki koefisien yang mencirikan estimasi algoritma untuk fungsi target (f). Algoritma yang berbeda memiliki representasi yang berbeda dan koefisien yang berbeda, tetapi banyak dari mereka memerlukan proses optimasi untuk menemukan set koefisien yang menghasilkan estimasi terbaik dari fungsi target.

	Common examples of algorithms with coefficients that can be optimized using gradient descent are Linear Regression and Logistic Regression.		Common examples of algorithms with coefficients that can be optimized using gradient descent are Linear Regression and Logistic Regression.